Friday, February 28, 2014

Power Pivot memory consumption

"Memory error: Allocation failure : Not enough storage is available to process this command. . If using a 32-bit version of the product, consider upgrading to the 64-bit version or increasing the amount of memory available on the machine.

http://sqlblog.com/blogs/marco_russo/archive/2010/01/26/memory-considerations-about-powerpivot-for-excel.aspx

Memory Considerations about PowerPivot for Excel

PowerPivot for Excel is an add-in that uses a local version of Analysis Services (SSAS) to process data and perform calculations in response to user queries made in Excel 2010. The SSAS engine is loaded in-memory inside the Excel process. Especially with a 32-bit version of Excel, there are particular considerations about memory use.
When an Excel file containing PowerPivot data is opened, memory consumption is not affected by the volume of PowerPivot data. That data is loaded only when the PowerPivot window is opened or when the PivotTables and/or PivotCharts based on PowerPivot data are updated.
Loading PowerPivot data into Excel requires two steps:
  1. The SSAS backup contained in the Excel file is restored into a temporary folder (a folder named with an IMBI_ prefix is created in the C:\Users\<<USERNAME>>\AppData\Local\Temp directory). This requires enough disk space to restore these files, which are usually 2 to 3 times the size of the Excel file.
  2. The SSAS database, once decompressed, is also loaded into virtual memory. Thus, as a rule of thumb, loading and browsing the data of an Excel file containing PowerPivot data requires 2 to 3 times the size of the Excel file in memory.
The memory available for PowerPivot in a 32-bit version of Excel is just above 1 GB (it is lower than the virtual memory addressable space). For this reason, it is not hard to get this error while processing a large set of data:
Memory error: Allocation failure : Not enough storage is available to process this command. .
The operation has been cancelled.
The best way to avoid this error is to use a 64-bit version of Excel. However, there are many reasons for not having 64-bit Excel even if the operating system is 64-bit. The main reason, in my personal case, is the use of Outlook add-ins that are not available for the 64-bit version, and it is not possible to mix 32-bit and 64-bit Office products on the same machine. Therefore, it is a good idea to try to optimize the PowerPivot data model so that it makes better use of the available memory.

Reduce the number of rows

This suggestion may sound obvious: reducing the number of rows also reduces the size of the database. If you are extracting data from a database and you don’t need the full detail during the analysis, you might import data at the cardinality really required by your analysis. For example: if a table contains all the sales transactions of a retail store, but you only need to analyze data at the month level, you might group data by month, reducing the number of rows to be loaded, as in the sketch below. Most of the time this optimization is not possible, because the user doesn’t want to lose the granularity of analysis.
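As a sketch of that idea (the table and column names below are hypothetical, not from the post), the source query used to feed PowerPivot can pre-aggregate transactions to the month level and import only the columns actually needed:

-- Hypothetical source query: aggregate sales transactions to the month level
-- before import, so PowerPivot loads far fewer rows.
SELECT  DATEADD(MONTH, DATEDIFF(MONTH, 0, s.TransactionDate), 0) AS SalesMonth,
        s.StoreID,
        s.ProductID,
        SUM(s.Quantity)    AS Quantity,
        SUM(s.SalesAmount) AS SalesAmount
FROM    dbo.SalesTransactions s
GROUP BY DATEADD(MONTH, DATEDIFF(MONTH, 0, s.TransactionDate), 0),
         s.StoreID,
         s.ProductID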

Reduce the number of columns

This is by far the most important suggestion. If you don’t need a column, don’t import it!
PowerPivot uses a column-oriented database technology. Each column has its own storage and index structures. Each column increases processing time and the memory required for both processing and query execution.
An important concept is that every imported column matters. Even if PowerPivot internally uses an Analysis Services engine, it is a particular (in-memory) type of engine that doesn’t make a distinction between quantitative measures and qualitative attributes. Thus, a SalesQuantity column is considered both a dimension attribute and a measure.

Optimize column data types

Not every column has the same weight in terms of memory and processing time. A column with few distinct values will be lighter than a column with a high number of distinct values. As said above, this is important also for measures, which are treated as possible attributes too. If the measure you are storing is a float and is the result of a calculation, consider reducing the number of digits to be imported. This will reduce the size of the dictionary, and possibly also the number of distinct values.
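For example (again with hypothetical names), a derived float measure can be rounded in the source query before it ever reaches PowerPivot:

-- Hypothetical source query: round a calculated float measure to 2 digits on import,
-- reducing the number of distinct values PowerPivot has to store in its dictionary.
SELECT  s.ProductID,
        s.TransactionDate,
        ROUND(s.SalesAmount * s.ExchangeRate, 2) AS SalesAmountEUR
FROM    dbo.SalesTransactions s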

Columns with string data types

PowerPivot stores a dictionary of all the unique strings in each attribute. Thus, a long string is not a problem by itself, nor are a few very long strings if the number of distinct values is low. What matters is the average length of the strings in an attribute multiplied by the number of unique values of that attribute: this product approximates the size required to store the dictionary. For this reason, importing a long description that is different for every transaction is not a good idea.
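A quick way to gauge this cost in the source data (hypothetical table and column, and only a rough approximation of the dictionary size) is to sum the lengths of the distinct values:

-- Rough estimate of the dictionary cost of a string column: total characters
-- across its distinct values (distinct count x average length).
SELECT  COUNT(*)                 AS DistinctValues,
        AVG(LEN(d.Description))  AS AvgLength,
        SUM(LEN(d.Description))  AS ApproxDictionaryChars
FROM    (SELECT DISTINCT Description FROM dbo.SalesTransactions) AS d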

Avoid high-cardinality columns

Columns that contain a very high number of distinct values are very expensive for PowerPivot. For example, the Invoice ID in a Sales Transactions table is very useful for the end user when it is necessary to drill down to the transaction level. However, be aware that this single column might be the most expensive one in your PowerPivot dataset, especially during the processing phase.
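A simple check in the source data (hypothetical names again) is to compare the number of distinct values of the suspect column with the total row count; a ratio close to 1 means the column is nearly unique and will be expensive:

-- How close to unique is the Invoice ID? A ratio near 1.0 means almost every row
-- carries its own value, which makes the column expensive for PowerPivot.
SELECT  COUNT(*)                    AS TotalRows,
        COUNT(DISTINCT s.InvoiceID) AS DistinctInvoiceIDs,
        COUNT(DISTINCT s.InvoiceID) * 1.0 / COUNT(*) AS DistinctRatio
FROM    dbo.SalesTransactions s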

Consider Calculated Columns

Each calculated column is stored in the PowerPivot data model just as if it were an imported column. This is not true for calculated measures, which are computed at query time. If the same calculation can be defined as either a calculated column or a calculated measure, the latter is the better solution from a resource consumption point of view. However, with a calculated measure you lose the ability to navigate its values as an attribute, which you can do with a calculated column.
If you have to store a measure as a calculated column, consider reducing the number of digits of the calculation: you can use the DAX function named ROUND just for this purpose.

Normalize data

Normalizing data doesn’t have a big effect on the size of the resulting database. However, it can have a strong impact on both the processing time and the memory required to process data.
The key is to find the right balance. A fully denormalized table (the case of a single SQL query that denormalizes all of the attributes, resulting in a PowerPivot dataset made of a single table) has a long processing time and requires more memory to be processed. In fact, during processing PowerPivot holds 1-2 million rows in a buffer and, if the rows are large, this restricts the memory available to store data.
At the same time, a complete normalization of the data, such as third normal form, is not a good idea either. The number of relationships grows considerably, and it forces you to keep many columns in the model for purely technical reasons (defining relationships) that are of no use to the end user. The resulting model is much more complex and is not faster than a balance between these two extremes.
The right balance is the one offered by a typical star schema. Putting all the attributes of a single entity into one table, just like a dimension in a Kimball star schema, appears to be the best tradeoff for PowerPivot as well. The resulting model is also easier for end users to navigate.
This level of normalization usually offers the best processing times, some savings in storage size and better use of memory during data processing (the import and refresh phases).
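As an illustration of this tradeoff (hypothetical tables, not from the post), instead of one big denormalized query the PowerPivot model can be fed with one query per entity, and the tables are then related on their keys inside PowerPivot:

-- Star-schema style import: one query per entity instead of a single
-- fully denormalized query.

-- Fact table: one row per transaction, keys plus measures only.
SELECT  s.TransactionDate, s.ProductID, s.StoreID, s.Quantity, s.SalesAmount
FROM    dbo.SalesTransactions s

-- Product dimension: all descriptive attributes of a product in one table.
SELECT  p.ProductID, p.ProductName, p.Category, p.Subcategory
FROM    dbo.Products p

-- Store dimension.
SELECT  st.StoreID, st.StoreName, st.City, st.Country
FROM    dbo.Stores st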

Monday, February 17, 2014

Show/Hide MS-Access 2007 Navigation Pane


See also:   

  1. http://msdn.microsoft.com/en-us/library/bb256564(v=office.12).aspx
  2. http://support.microsoft.com/kb/826765/en-us


Put this code into a Module of your Access DB.


Public Sub Secure_database()
    ' This will require that there is a visible form already displayed!
    With CurrentDb
        .Properties("AllowShortcutMenus") = False
        .Properties("AllowFullMenus") = False
        '.Properties("AllowBreakIntoCode") = False
        .Properties("AllowSpecialKeys") = False
        .Properties("StartupShowDBWindow") = False
    End With

    DoCmd.Save
    DoCmd.CloseDatabase
End Sub

Public Sub UnSecure_database()
    With CurrentDb
        .Properties("AllowShortcutMenus") = True
        .Properties("AllowFullMenus") = True
        '.Properties("AllowBreakIntoCode") = True
        .Properties("AllowSpecialKeys") = True
        .Properties("StartupShowDBWindow") = True
    End With

    DoCmd.Save
    DoCmd.CloseDatabase
End Sub

Function ap_DisableShift()
    'This function disables the SHIFT key at startup. This action causes
    'the Autoexec macro and the Startup properties to always be executed.
On Error GoTo errDisableShift

    Dim db As DAO.Database
    Dim prop As DAO.Property
    Const conPropNotFound = 3270

    Set db = CurrentDb()

    'This next line disables the SHIFT key on startup.
    db.Properties("AllowByPassKey") = False

    'The function was successful.
    Exit Function

errDisableShift:
    'This part of the error routine creates the "AllowByPassKey"
    'property if it does not exist.
    If Err = conPropNotFound Then
        Set prop = db.CreateProperty("AllowByPassKey", dbBoolean, False)
        db.Properties.Append prop
        Resume Next
    Else
        MsgBox "Function 'ap_DisableShift' did not complete successfully."
        Exit Function
    End If
End Function

Function ap_EnableShift()
    'This function enables the SHIFT key at startup. This action causes
    'the Autoexec macro and the Startup properties to be bypassed
    'if the user holds down the SHIFT key when opening the database.

    'If you want to disable the SHIFT key, type
    '    ap_DisableShift in the Immediate window, and then press ENTER.
    'If you want to enable the SHIFT key, type
    '    ap_EnableShift in the Immediate window, and then press ENTER.
On Error GoTo errEnableShift

    Dim db As DAO.Database
    Dim prop As DAO.Property
    Const conPropNotFound = 3270

    Set db = CurrentDb()

    'This next line enables the SHIFT key on startup.
    db.Properties("AllowByPassKey") = True

    'The function was successful.
    Exit Function

errEnableShift:
    'This part of the error routine creates the "AllowByPassKey"
    'property if it does not exist.
    If Err = conPropNotFound Then
        Set prop = db.CreateProperty("AllowByPassKey", _
            dbBoolean, True)
        db.Properties.Append prop
        Resume Next
    Else
        MsgBox "Function 'ap_EnableShift' did not complete successfully."
        Exit Function
    End If
End Function



Wednesday, February 12, 2014

Find column name in database



SELECT name, column_id, * FROM sys.columns 
where object_ID = (
SELECT a.object_id --b.name as 'schema_name', a.name, a.object_id, a.schema_id, a.parent_object_id, a.type, a.type_desc, a.create_date, a.modify_date
FROM sys.objects a INNER JOIN sys.schemas b ON a.schema_id = b.schema_id
WHERE object_id IN (select object_ID from sys.columns where name like 'snapshot%')
and b.name = 'snapshot'
and a.name = 'T2_Flatfile_snapshot_level'
--ORDER BY a.[type], b.name, a.name
)



select *
from sys.objects
where object_id IN (select object_ID from sys.columns where name like '30Day%')



--1760985600
SELECT b.name as 'schema_name', a.name, a.object_id, a.schema_id, a.parent_object_id, a.type, a.type_desc, a.create_date, a.modify_date
FROM sys.objects a INNER JOIN sys.schemas b ON a.schema_id = b.schema_id
WHERE object_id IN (select object_ID from sys.columns where name like 'snapshot%')
and b.name = 'snapshot'
and a.name = 'T2_Flatfile_snapshot_level'
ORDER BY a.[type], b.name, a.name

--823934257
SELECT b.name as 'schema_name', a.name, a.object_id, a.schema_id, a.parent_object_id, a.type, a.type_desc, a.create_date, a.modify_date
FROM sys.objects a INNER JOIN sys.schemas b ON a.schema_id = b.schema_id
WHERE object_id IN (select object_ID from sys.columns where name like 'snapshot%')
and b.name = 'dbo'
and a.name = 'Flatfile_snapshot'
ORDER BY a.[type], b.name, a.name


SELECT object_id, name, column_id FROM sys.columns where object_ID = 109503719
union all
SELECT object_id, name, column_id FROM sys.columns where object_ID = 823934257
order by 2
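

The snippets above can also be rolled into a single lookup; the sketch below joins sys.columns to sys.objects and sys.schemas directly (the 'snapshot%' pattern is just the example used in this post):

-- List every table that has a column matching a name pattern,
-- together with the schema, table and matching column.
SELECT  s.name AS schema_name,
        o.name AS table_name,
        c.name AS column_name,
        c.column_id
FROM    sys.columns c
        INNER JOIN sys.objects o ON o.object_id = c.object_id
        INNER JOIN sys.schemas s ON s.schema_id = o.schema_id
WHERE   c.name LIKE 'snapshot%'
ORDER BY s.name, o.name, c.column_id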


Wednesday, February 5, 2014

The binding status was "DBBINDSTATUS_UNSUPPORTEDCONVERSION". The data flow column type is "DT_NTEXT".



The problem is that the data in that column is longer than 255 characters. If the Jet driver (the code that reads the Excel file for SSIS) detects content longer than that in a column, it presents the column to SSIS as a DT_TEXT or DT_NTEXT data type, and SSIS has no control over that behaviour. There is no such thing as changing the "data type" in Excel; you can only change the data "format", and that won't help here, because it is simply a matter of size.
What you have to do is accept that you're getting a DT_NTEXT into SSIS, which is a "long" string type (a CLOB). To convert it to a DT_WSTR, you'll need to use a Data Conversion or Derived Column transformation to create a new column from that data. Make sure the new column has enough room for the incoming data.