Tech and T-Accounts

TEDx Greensboro

TEDx Greensboro is a local event of TED Talks (TED = Technology, Entertainment, Design).  I was fortunate to attend this inagural event in Greensboro (attendence was choosen through lottery).  The theme was 'Dreamsboro' -- with the idea that presentations would focus on actions that make Greensboro, NC a place where folks want to live, work, play, retire, and do other good things.

In the first few minutes of the TEDx event, the moderator, Justin Catanoso asked that cell phones, etc. be turned off.  So, in the first few minutes, there wasn't much activity.  Then, slowly, folks couldn't contain their enthusiasm any longer and the tweeting began in earnest.

The viz disects the words used by the Tweeters as well is data about who tweeted, which speakers were mentioned, what tags were used.  A user has the ability to see the Tweeters' profile, the tweet itself or open links mentioned within a tweet. 

Now, let's talk about how this viz was built --

First, it was inspired by Andy Cotgreave's efforts that culled Tweets and used at the Tableau Customer Conference 2012, as well as other conferences (directions found here).  To obtain the raw data, I installed Python and ran the Python script to create a CSV file of the tweets.  That was the easy part.

I wanted to extend Andy's approach so that I could use Tableau 8's new word cloud viz.  To do that, I needed to split the tweets into separate words.  Likely this could have been done in Python, as likely could have the rest of the tasks, but I used what I knew well -- Excel.  I created a macro that splits the tweets into words - each word placed into a separate column on the same line as the tweet.  What did I have now?  The basic structure needed to use the Tableau Reshaper Tool Add-In (found here).  This then gave me a row for each word, along with the full tweet info.

Next, I found that many words were noise.  Adjectives, pronouns, mis-spellings, etc. existed.  A little manual effort identified these noise words and set them aside in a separate worksheet tab.  I then connected Tableau public to the Excel, created a join condition that excluded rows with the noise words.  From there it was just a matter of creating the word clouds, tweet timeline, and some parameter driven choices for how to present the data.

Did I mention that all the tools used for data acquisition, reshaping and vizualization are free?  Python, Tableau Data Reshaper, and Tableau Public will handle up to ONE MILLION rows of data!  You can use the above guidance to recreate a viz of your own against your own media event.  I entered this into a 2013 Tableau User Conference contest, please tweet about the viz using the hashtag #TableauTEDxGSO.

Hope you enjoy!


SAS ODBC and Tableau

SAS ODBC Driver setup instructions for creating a DSN that can be used with any ODBC reading software (like Excel, Access, Tableau):

I worked with Robert Morton of Tableau Software for the past year to get this SAS connectivity working and debugged several issues. Here are some notes about what I found out:

- 32 BIT ODBC drivers from SAS need to be installed on the local PC. A SAS license is not required to use this driver.
- Set up a DSN using windows to establish a connection through the SAS ODBC driver (you must use the 32 bit version, so to get the ODBC setup for the 32 bit, and you have a 64 bit operating system, don't open the data connections from Control Panel as that will create a 64-bit connection.  To open a 32 bit connection (at least for Windows 7), place this into your start menu search C:\Windows\SysWOW64\odbcad32.exe. To eliminate the concern whether the connection issues are Tableau related, test your DSN setup using a product that SAS ODBC does recognize, like Microsoft Excel or MS Access. This will enable you to have a conversation with SAS technicians without muddying the issue with Tableau.
- Once the DSN connectivity works (you will need to set up different ones for SAS/SHARE versus locally accessed SAS7bdat files), then connect with Tableau.
- While you can connect live, date fields are not properly interpreted unless it is an extract. Explore this and test!
- When setting up to connect to SAS/SHARE, the SAS instructions are somewhat obtuse. Names of the SAS/SHARE as recognized by the server must be included in your services file and match the DSN Server name. Here is the setup (with some names modified to maintain confidentiality):

Setup Windows so it will know where to route the port:

c:\windows\system32\drivers\etc\services (be sure to insert this line in numeric ascending order of the port #'s)
share 1624/tcp #SAS odbc

DSN setup:
General tab:
   Data Source Name: My SAS connection (this is freeform)
   Description: (can be left blank)
   Server: share.share
   Records to Buffer: 100000 (I picked more than Robert mentions because I'm dealing with large datasets)

SQL Options to check:
   Support VARCHAR
   Disable_0 override parsing
   Fuzz numbers at 12 places
   Infer type in min/max functions

Servers Tab:
   Name:share.share (must be the same as on General tab)
   Password: (left blank)
   Configure button:
   Server Address: servername (this is the actual server name that is used within SAS to connect -- no // or other UNC conventions, just the name)
   UserName: myproduser (we use a Unix user name that is for production reporting access, but whatever user you use, it must have access to the server named above)
   UserPassword: xxxxxxx (we enter the myproduser password so that it doesn't have to be entered each time that you want to access sas/share)

Libraries tab:
   Name: Mylibrary (this is freeform)
   Host File: /export/appl/theexposeddirectory/ (beginning and ending slash are important -- the user must have access to this directory, which is where the sas7bdat files should reside).

Tableau specifc data connection file (only for Tableau versions 6.14 and before)
You may also need to create a tdc file to be placed in the "My Tableau Repository\datasources" subdirectory, but generally this isn't required for version  6.1.4  and after. Create genericodbc-sas.tdc with a text editor with the following content:

<connection-customization class="genericodbc" enabled="true" version="7.7">
<vendor name="SAS"/>
<driver name="SAS"/>
<customization name="CAP_CREATE_TEMP_TABLES" value="no"/>
<customization name="CAP_ODBC_USE_NATIVE_PROTOCOL" value="yes"/>
<customization name="CAP_QUERY_BOOLEXPR_TO_INTEXPR" value="yes"/>
<customization name="CAP_QUERY_GROUP_BY_ALIAS" value="no"/>
<customization name="CAP_QUERY_GROUP_BY_DEGREE" value="yes"/>
<customization name="CAP_QUERY_JOIN_ACROSS_SCHEMAS" value="no"/>
<customization name="CAP_QUERY_JOIN_REQUIRES_SCOPE" value="no"/>
<customization name="CAP_QUERY_SUBQUERIES" value="yes"/>
<customization name="CAP_QUERY_SUBQUERIES_WITH_TOP" value="no"/>
<customization name="CAP_QUERY_TOP_N" value="no"/>
<customization name="CAP_SELECT_INTO" value="no"/>
<customization name="CAP_SELECT_TOP_INTO" value="yes"/>
<customization name="CAP_SKIP_CONNECT_VALIDATION" value="yes"/>
<customization name="CAP_SUPPRESS_DISCOVERY_QUERIES" value="yes"/>
<customization name="CAP_ODBC_FORCE_SINGLE_ROW_BINDING" value="yes"/>
<customization name="CAP_ODBC_BIND_SUPPRESS_INT64" value="yes"/>

Other resources that might be useful:

 Howard Plemmons, SAS Institute paper 156-28, V8 SAS help on ODBC, v9.3 SAS help on ODBC, SAS Help on ODBC:


Oracle SQL - check for no table, no rows

I was recently tasked with an interesting issue.  The report that I created connected to a database table that would be refreshed regularly.  During the refresh process (sometimes taking minutes, sometimes longer), the report would fail to show any data, since the table was empty at that exact moment.  I needed a means to determine if the report had no rows, then I would display a message that the data was being updated.  The report was then to be migrated to another database, where the table didn't even exist yet.  In this case, I wanted to also display the same wait message.  Oracle doesn't have an obvious method to determine if a table exists.  Attempting to run Oracle SQL against a non-existent table results in an Oracle 904 error. 

I figured out that this code will do the trick, whether the table exists or not and whether the table has any rows in it or not.  It returns a binary 1 if the table doesn't exist or doesn't have any rows in it, otherwise a zero is returned.

   CASE WHEN (select num_rows from all_tables where table_name = '<your table name>') IS NULL
   THEN 1
   ELSE 0
from dual


Page 6 of 7 << < 1 2 3 4 5 6 7 > >>
Copyright© Brad Earle
Site Map | Printable View | © 2008 - 2022 KB Earle Associates LLC | |