Tech and T-Accounts

PowerShell install version 4

PowerShell 4.0 added many features, but the one that mattered to me was the -append flag when exporting to csv. With this flag, if the csv file exists, then the rows of info are appended, if the file doesn't exist, then the file is created. While this could be done in older versions, that simple logic basically meant writing code to trap the file exist issue and then branch to the right export command.

So, the question becomes, how do you upgrade to PowerShell 4.0? Do you even have PowerShell 4.0? How do you tell what PowerShell version you have?
To determine the PowerShell engine version, use $PSVersionTable.PSVersion at the PowerShell command prompt. If this command doesn't work from the PowerShell command prompt, then the installed PowerShell is version 1.

Not sure whether PowerShell is installed because it doesn't show up in the Start menu? Well, you could check the registry -- HKEY_LOCAL_MACHINE\Software\Microsoft\PowerShell ... the version would come after that. Microsoft further explains the expected registry entries in this blog: https://blogs.msdn.com/b/powershell/archive/2009/06/25/detection-logic-poweshell-installation.aspx

Me, I prefer working with the Windows PowerShell ISE, but the above command can also run from the PowerShell command line (kinda looks like a DOS window). Both options should show up by typing in PowerShell from the Start button then selecting the verison you want -  Windows PowerShell ISE or Windows PowerShell (command line version).  If you like the ISE like I do, then from the Start button type in ISE to get to the two versions (x86 which I don't use, and Windows PowerShell ISE (my preference). 

Here is the link from Microsoft -- http://www.microsoft.com/en-us/download/details.aspx?id=40855. Installing version 4 of PowerShell also updates features of the Windows Management Framework to version 4.0.

Another tidbit -- even with PowerShell installed, PC's typically are shipped with 'Execution Policies' not enabled for scripts to be run.  To modify this setting, a regedit is required. 

run PowerShell as administrator, then run the following command: Set-ExecutionPolicy RemoteSigned

Update - install Active Directory for PowerShell module in order to query Active Directory
Easiest way to install AD module is to run the following PowerShell command for Windows 8 and prior:  Add-WindowsFeature RSAT-AD-PowerShell

For Windows 10, download a package and install it with elevated privileges - instructions found here.

Further info on AD module installation: https://www.microsoft.com/en-us/download/details.aspx?7887 is where you'll find the Remote Server Administration for Windows 7 download, then follow the instructions found here - https://4sysops.com/archives/how-to-install-the-powershell-active-directory-module/

Gravatar

Foraging through Teradata

I have been asked to find data in a Teradata data warehouse without being provided any contextual 'sherpa guide'.  Fortunately, the requested person is familiar with the data that they seek and already has sourced a vetted version of the data through other means.  When that person then wants the data to come from a Teradata connection, it is typically because the vetted source doesn't have a convienent means to enable Tableau to connect to the data directly, but the requesting party knows that the data resides in the data warehouse and has obtained proper permissions to read the data directly.

After securing permissions and having knowledge of what the 'right' data should look like, I then connect to Teradata only to find something like 1 million unique Database/Table/Column combinations.  To assist my finding the data, I need to create my own sherpa guide.  Fortunately, DB's that run Teradata DW typically have instilled common naming conventions to their databases/tables/columns and rarely have obfuscated those items.  Connecting Tableau to the DBC database and the columsv view provides a comprehensive list of databases, tables and columns that then can be filtered using wildcard matching to find potential data sources for the requested data.

After finding potentials, creating a connection to each found db/table enables the data discovery (and lets you know if you have permissions to see that particular db/table/column).  The last step is to verify the quality of the data to the known vetted source.  With the last step complete, going further to actually create data visualization stories with Tableau can begin.

Gravatar

Tableau Server: Extracts, Data Connections, Data Sources

A post on the Tableau Linked In forum got me thinking about how to explain about Tableau Server (version 6-8) Data Sources.  A Data Source published to Tableau Server is a convenient method to share the data connection instructions (meta-data) including dimensions, measures, sets and any calculated fields.  Tableau covers some of the issues related to Data Sources in its knowledge base article found here.

But, the differences are quite perplexing and subtle, and which combination of tactics is used depends on the speed of the actual data repository connection and the reason for using Tableau server access rights to expose a data source rather than a file based solution (save a .TDS file to a network drive).  Below is an explanation of the terms as well as insights into considerations of the various elements.

Extracts - A copy of the data is placed in a proprietary database engine.  Use extracts when the direct connect to the data is cumbersomely slow, the data is a snapshot, or the features of the underlying data source lack functionality (like the MS Jet engine)

Data Connection -- When authoring a workbook, data connections are created to obtain data from a specific data repository.  Repetitive data connections can be made to the same data repository (a global filter applies to a specific data connection.  A use case for repetitive connections is when there are some worksheets created that use a global filter and others in the same workbook that shouldn't have the global filter applied).  A data connection is a SQL statement that is comprises of either a single table, multiple tables with join conditions or custom SQL.

Data Source -- This is a Tableau term for the xml instructions that can be saved either to a file or to the server.  The actual data is not contained in these files or loaded to the server. Data Sources can include instructions about how and where an extract can be found. A data source only includes the metadata that tells Tableau how to create the data connection and what formulas and native data elements exist in that data connection. It does not include formatting.

You can make it seem that the data is on the tableau server. When an extract is saved to a network location (use the UNC filename, not mapped drives) that the server and user can also access, then when a user selects 'Import Data Source' from the server, the data, dimensions, measures and sets appear in the workbook because the user has access to the data extract that is defined in the data source, not because the data resides on Tableau Server.

Version 6 permits saving extract files to a network drive, while prior versions required extracts to reside on a local drive.  Extracts on a network drive can impact performance.

Publishing Data Sources to a server is a method to share a common set of formulas, etc. with other server users (the publisher chooses who to give permissions).  If the connection requires database permissions, then those credentials will have to be entered by the user to access the actual data.  However, come caveats: 
1. Read only vs. update permissions to the extract located on a network drive have to be managed by network drive access permissions. 
2. If the network extract file is open by any process, then the extract cannot be updated.  A network drive extract file can have several workbooks read the file at the same time.
3. A workbook with update file rights can add dimensions, measures, sets to the data source and re-generate the extract.  The new dimensions aren't shared, however, unless the data source is re-published to the server.  When other workbooks open, they see a message "A newer version of the published data source is available.  Would you like to use the latest version?"  Answering no means you get to the data source as before, but use the old XML definition that was in the workbook. Some elements could be broken in your workbook due to renamed or removed items from the extract file. Answering yes brings in the new elements.

If the data connection was saved with instructions to use it as an extract, then a popup appears to either:
 1. locate the  the extract (which could be on a network drive)
 2. remove the extract (it will then require appropriate permission to use the data connection)
 3. Deactivate the extract (keeps the extract info in the xml file, but doesn't use that information -- instead requires appropriate permission to use the data connection)  If the extract file didn't exist before, an error will occur when attempting to regenerate the extract.
 4. Regenerate the extract (careful!  this regenerates the extract in the same location where it was originally generated - so to do this you will need permission to use the data connection and permission to save the extract wherever it was - it shows you a shortened version of the save location, but you don't have a choice to change this location)

So, lets complicate it some more:
When you publish a workbook to the server and it uses an extract, the extract is pulled from the network/local drive and placed on the server as part of a packaged workbook.  This is a copy of the extract and not the local/network located data source.  So, if the loc data source is modified and saved to the network, the Tableau server version of the workbook doesn't pull from that location, instead it pulls from the extract copy.  The meta-data about the data source is frozen as of the time that the server version of the workbook was published.  To get the server version to recognize changes, the workbook needs to be re-published with the updated data source information to be embedded into a packaged workbook.

I recently had an issue where I needed to have both a live connection to an Excel workbook and a MS SQL Server connection as a data extract in the same workbook.  By saving the workbook as a twb file, the live connection to the local file remained active.  If I saved the workbook as a twbx, then a copy of the Excel file was embedded in the workbook and the live connection was lost.  So, here comes the tricky part.  I wanted to publish this workbook to Tableau Server with a live connection to the Excel file.  The problem happens when publishing, Tableau automatically 'packages' the workbook for consumption to the server when there is an extract present.  Packaging is another term for what happens when a twbx is created.  So, when I published the workbook to the server, the Excel file connection no longer pointed to the local resource (I used UNC naming convention to point it to a network share that the Tableau server could access).  To overcome this issue, I had to publish the Excel workbook as a datasource to the Tableau Server -- but unchecking the box 'Include External Files', thereby making the datasource a live read to the Excel file.  Last, I created a new data connection to the this Tableau server datasource, then replaced the original Excel Datasource with this new one.  Now, when I publish the workbook to the server, the live connection doesn't break, as it is defined by the published datasource, not by the workbook.  Even as a packaged workbook, the live link remains working.

Gravatar
Page 5 of 7 << < 1 2 3 4 5 6 7 > >>
Copyright© Brad Earle
Site Map | Printable View | © 2008 - 2022 KB Earle Associates LLC | |