Friday, 17 August 2012

SAP HANA : FAQ

SAP HANA 1.0 went GA this week (GA is SAP-speak for Generally Available), and it seemed like a good time to collect all the facts about SAP's new In Memory business and put them in one place. I hope you enjoy it. It's not an official SAP FAQ, but it is information pulled together from industry experts from various organizations.

1. HANA Overview
1.1 What are the product names?

The short answer is: it's a mystery. SAP has changed them around a lot and now they call it SAP HANA Appliance, SAP HANA Database and SAP HANA Studio. Applications built on HANA will be marked "powered by SAP HANA". Probably they will change it all again.
1.2 What is SAP HANA Appliance 1.0?

SAP HANA 1.0 is an analytics appliance that consists of certified hardware, an In Memory DataBase (IMDB), an Analytics Engine and some tooling for getting data in and out of HANA. You build the logic and structures yourself, and use a tool, e.g. SAP BusinessObjects, to visualise or analyse data.
1.3 What are the limitations of HANA 1.0?

Quite a few so far - it can only replicate certain data, from certain databases, in certain formats, using the Sybase Replication Server. Batch loading is done using SAP BusinessObjects Data Services 4.0 and is optimised only for SAP BusinessObjects BI 4.0 reporting.
1.4 What is SAP HANA 1.5, 1.2 or 1.0 SP03?

These are all the same thing, and 1.0 SP03 is touted to be the final name for what should go into RampUp (beta) in Q4 2011. This will allow any SAP NetWeaver BW 7.3 Data Warehouse to be migrated into a HANA appliance. HANA 1.0 SP03 specifically also accelerates BW calculations and planning, which means you get even more performance gains.
1.5 What's the difference between HANA and IMDB?

HANA is the name for the current BI appliance (HANA 1.0) and the BW Data Warehouse appliance (HANA 1.0 SP03). Both of these use the SAP IMDB Database Technology (SAP HANA Database) as their underlying RDBMS. Expect SAP to start to differentiate this more clearly as they start to position the technology for use cases other than Analytics.
1.6 If I can run NetWeaver BW on IMDB/HANA, why can't I run the Business Suite/ERP 6.0?

Simply because it's not mature enough yet to support business critical applications. From a technology perspective, it is already possible to run the Business Suite on IMDB and SAP has trialled moving some large databases into IMDB already.
1.7 What is HANA great at?

The best thing that HANA brings to the table is the ability to aggregate large data volumes in near real-time - and to have the data updated in near real-time. SAP's demos show hundreds of billions of records of data being aggregated in a matter of seconds. SAP has built a set of Analytics Apps on top of HANA and these are set to be great point use cases to get customers up and running quickly.
1.8 What is HANA bad at?

There are some current issues around HANA when delivering ad-hoc analytics, especially when using the SAP BusinessObjects Webi tool. Essentially the problem is that you can ask computationally very difficult questions with Webi, which can cause very long response times with HANA. SAP will need to build optimization for both Webi and HANA to reduce the computational complexity of these questions, but they're not there yet.

What's more, it's worth noting that HANA 1.0 is not a Data Warehouse and it is more of a Data Mart - that is, suited to point applications where there is a clear use case.
1.9 What does HANA cost?

SAP hasn't entirely confirmed HANA licensing costs but the hardware is somewhere around $1-200k per TB. Add to this licensing costs which are still being made on a per-customer basis.
1.10 Why is HANA so fast?

Regular RDBMS technologies put the information on spinning plates of iron (hard disks) from which the information is retrieved. HANA stores information in electronic memory, which is some 50x faster (depending on how you calculate). HANA stores a copy on magnetic disk, in case of power failure or the like. In addition, most SAP systems have the database on one system and a calculation engine on another, and they pass information between them. With HANA, this all happens within the same machine.
1.11 Does HANA/IMDB replace Oracle?

It's the elephant in the room, but once the Business Suite runs on IMDB, Oracle won't be needed any more by SAP customers who purchase HANA. This doesn't affect anything in the short term because those people buying HANA today will still need an Oracle ERP system.
1.12 What is this about 10:1 compression with HANA compared to Oracle?

A typical uncompressed Oracle or Microsoft SQL Server database, when put into HANA, will be 10x smaller than before, and this is due to the way that HANA stores information in a compressed format. Note that most databases are now compressed, so these numbers may not fit your scenario. On top of this, you need RAM equal to 2x the size of your HANA database, plus room for growth. HANA sizing is still a dark art.
1.13 You mean I have to buy a HANA only 2.5x smaller than my big Oracle RDBMS? What about archiving and data ageing?

Yes, in some instances you may have to buy a HANA appliance that is only 2.5x smaller than it would be under Oracle. And data ageing isn't part of the 1.0 release, but SAP is certainly working on it pretty hard. Let's hope they release something faster than you need to buy a bigger HANA appliance!
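To make the arithmetic behind the "10x" and "2.5x" figures concrete, here is a rough back-of-the-envelope sketch. It is not an SAP sizing tool: the function name and the 5:1 ratio for an already-compressed source database are assumptions chosen purely for illustration, and only the 10:1 ratio and the 2x RAM factor come from the answers above.

```python
def hana_ram_tb(source_db_tb, compression_ratio=10.0, ram_factor=2.0, growth=0.0):
    """Rough estimate of the RAM (in TB) a HANA appliance needs.

    source_db_tb      -- size of the existing database on disk, in TB
    compression_ratio -- how much smaller the data gets inside HANA
                         (10:1 is the figure quoted for uncompressed sources)
    ram_factor        -- RAM needed per TB of in-memory data (2x, as stated above)
    growth            -- optional headroom, e.g. 0.2 for 20% (hypothetical parameter)
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    in_memory_tb = source_db_tb / compression_ratio
    return in_memory_tb * ram_factor * (1.0 + growth)


# A 10 TB uncompressed source at 10:1 compression.
uncompressed = hana_ram_tb(10.0)

# The same 10 TB, but the source is already compressed, so we ASSUME
# HANA only achieves 5:1 on it (illustrative number, not from SAP).
compressed = hana_ram_tb(10.0, compression_ratio=5.0)

print(f"uncompressed source: {uncompressed} TB RAM ({10.0 / uncompressed}x smaller)")
print(f"compressed source:   {compressed} TB RAM ({10.0 / compressed}x smaller)")
```

Under these assumptions the already-compressed case lands on an appliance only 2.5x smaller than the source, which is one way the number in the question can come about.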
1.14 What's the wider market opportunity for IMDB?

This is the interesting thing - no one knows yet, and few analysts seem to have cottoned on that the wider market opportunity might be huge. Think not just SAP applications but any third party that requires ultra-high speed. Think not just an appliance but a development platform. Time will tell.
2. SAP HANA database hardware
2.1 What hardware is supported right now?

Talk to your hardware vendor - all of the major vendors e.g. HP, IBM, Dell, have HANA offerings now. Technically HANA will run on any Intel x64 based system from your laptop through to the big 40-core, 2TB RAM servers. It is however only supported on a small number of big rack-mount servers like the Dell R910 and HP DL980.
2.2 Why doesn't HANA run on blades?

It's unclear but probably because the blades don't yet offer the same performance. HANA is optimized for the Intel X7560 CPU and will run fastest on this. And for instance, the Dell M910 blade can only run 2x X7560 CPUs and 512GB RAM in this configuration, which probably explains the limitations. What's certain is that HANA will eventually run on blades - it's born to run on blade technology!
2.3 Does SAP make their own IMDB/HANA hardware?

Yes, but only in the labs so far. There are no public plans to compete against IBM/HP/Dell in this space, but it may make sense for SAP to enter the appliance market, especially in the context of Data Centres and even more so in the context of the SAP Business ByDesign cloud offering, which will run on IMDB.
2.4 How big does HANA scale?

Theoretically at least - very well. The biggest single-server HANA hardware will run most mid-size workloads - 2TB of in-memory storage is equivalent to 5-20TB of Oracle storage. The way that HANA works means that it is possible to chain multiple systems together - meaning that scalability has thus far been determined by the size of customers' wallets. Do note that whilst SAP talk up "Big Data" quite a lot, HANA currently only scales to the small end of Big Data. The term really refers to the kind of huge datasets that Facebook or Google have to store - not Terabytes, but rather Petabytes.
2.5 What storage subsystem does HANA use?

This varies from vendor to vendor but it is shared network attached storage (NAS). Both regular magnetic disks and SSD storage can be used for the backup of the database (HANA runs in memory remember, so disk storage is just for backup, and later, for data ageing). Note that you require 2x as much storage as you have RAM, and RAM is itself 2x the database size - i.e. storage size = 4x database size. In most cases there is additional ultra-high speed SSD storage for log files.
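The storage rule of thumb above is simple enough to write down as a tiny calculation. This is only a sketch of the ratios quoted in this FAQ; the function name is made up for illustration and real sizing should come from your hardware vendor.

```python
def hana_storage_tb(in_memory_db_tb, ram_factor=2.0, storage_factor=2.0):
    """Apply the rule of thumb: RAM = 2x database size, storage = 2x RAM.

    in_memory_db_tb -- size of the (compressed) database inside HANA, in TB
    Returns a (ram_tb, storage_tb) tuple.
    """
    ram_tb = in_memory_db_tb * ram_factor
    storage_tb = ram_tb * storage_factor
    return ram_tb, storage_tb


ram, storage = hana_storage_tb(0.5)
print(f"0.5 TB database -> {ram} TB RAM, {storage} TB storage")
```

Note that this excludes the separate SSD storage for log files mentioned above.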
3. Technical FAQ
3.1 What source databases does HANA support in real-time?

If you use Sybase Replication Server (SRS) for near real-time data then you need to watch out for licensing still (SAP have license deals pending). If you run DB2 then you're fine but with Oracle and Microsoft SQL Server there are some license challenges if you buy your license through SAP, because you may have a limited license that does not allow extraction. Talk to SAP for further information on this.
3.2 What source databases does HANA support for batch loads?

If you use SAP BusinessObjects Data Services 4.0 for bulk loads then pretty much anything. BO-DS is a very flexible Extract, Transform & Load tool that supports many databases - check out the specs for more details.
3.3 What additional limitations does Sybase Replication Server present?

SRS has additional restrictions which are worth bearing in mind. It can only replicate Unicode data and does not support IBM DB2 compressed tables.
4. Follow-ons, corrections & credits

This is a work in progress and your help correcting me, clarifying some things I may have not explained so well or even just asking a question that I haven't covered would be really useful for the wider market. Let me know and I'll expand this as the months go on!

FAQ about SAP BI4 WebIntelligence on HANA


As part of Customer Solution Adoption focusing on BI4 analytics suite, my colleague (David Francois Gonzalez) and I started this little FAQ on WebIntelligence on HANA. Some questions may look simplistic and common sense, but we hope this sheds light and sets the right expectations when using these great technologies. We will share more advanced best practices and use cases on this topic at SAP TechEd 2012 (Strategies for SAP BusinessObjects Web Intelligence and the Semantic Layer - AP262). There is also a SAP BI4 Elite Enablement happening in Vancouver in November that will deep dive into all BI4 related topics, including access to SAP BW+HANA (www.elite-enablement.com)

What is this FAQ about?

In this FAQ we are discussing SAP BI4 Web Intelligence connecting to SAP HANA in relational mode through a Universe in SQL. We are not discussing Web Intelligence connecting to SAP BW on HANA.

What is SAP HANA?

SAP HANA is a super-fast in-memory database enabling new possibilities in terms of analytical reporting and real-time data acquisition and consumption.

Is SAP HANA really real-time?

The term “Real-time” can be ambiguous. Let’s say SAP HANA can be super-fast, depending on the context.

Analytical reporting: HANA is designed for on-line analysis. It can calculate over millions of records at sub-second speed. Of course this depends on other factors, like creating an optimal HANA model (Analytical Views) and the type of queries you’re sending to HANA. Also bear in mind that HANA will execute queries very quickly but it has no control over the amount of data being fetched (see question on Bad Design).

Data acquisition: HANA is shipped with ETL and replication tools that can load data from operational systems to HANA in a matter of seconds.

Data consumption: HANA solves the issue of having to maintain aggregate tables for performance (data is fresh and live, no need to wait for the next ETL batch).

What is SAP BOBJ Web Intelligence?

Web Intelligence (WebI) is an interactive reporting tool that can access relational as well as multidimensional datasources via the concept of Universes. The reports can be viewed online or offline thanks to the microcube, an embedded local in-memory cache engine. In the case of HANA, WebI will access it through relational access.

What are Universes?

A Universe is a metadata layer that allows Designers to map the underlying data sources into familiar business terms and optimize query access to the database (in this FAQ the database would be SAP HANA). This allows business users and analysts to create, access, analyze and share business content easily across the enterprise. Also referred to as the "Semantic Layer", Universes allow IT to control access to the data and guarantee its integrity and validity.
 

Is SAP Web Intelligence really real-time?

In order to display data coming from HANA in Web Intelligence, you first need to execute the query on the HANA side, transfer the data through the network, load the data into the local cube, execute some calculations and display the data in Web Intelligence. Depending on the data volume these actions can take from less than 1 second to much more time.

Why does Web Intelligence use a microcube / local cube?

The microcube is mandatory and an important part of the Webi architecture design. Having a microcube has a lot of advantages when designing and consuming Web Intelligence reports, and it has been designed for the vast majority of databases. The main benefit is that it can offer offline analytical capabilities like report viewing, drilling, quick filtering and local calculations:
  •   The WebI calculation engine compensates for missing data source expressiveness (cross table, multi-context evaluation, advanced functions like Previous, …). This requires fetching raw data first before performing local data processing
  •   It enables Multi-data Provider support, which requires local data materialization for data synchronisation
  •   It enables data historization (if not available from the data source), which allows users to create data snapshots and makes features like Track Data Change possible.

Moreover it can avoid unnecessary round trips to the database and perform very well if the report is well designed and sized (remember the microcube is in local memory). In the context of HANA, designing a Web Intelligence report is a matter of balance between performance and functionality (for example Track Data Change will not be available if you are using Query on Drill), but the idea is to have the smallest subset of data loaded in the microcube and to push all the heavy calculations down to HANA.

What is a typical use case for Web Intelligence with SAP HANA?

In order to best leverage Web Intelligence with SAP HANA, you need to design reports that require intensive aggregation and calculation on huge amounts of data that will be processed almost instantly by HANA. Another example is to create analytical reports that can answer a user’s business questions upon live requests, for example a drill-down report, going from summary data to detailed data. In this scenario, SAP HANA will aggregate the results on-the-fly after each user drill interaction, and WebIntelligence will only retrieve the results needed in the report, thus leading to a smaller microcube. You can leverage these types of workflows using the "Query on Drill" feature in WebIntelligence.

How can I best leverage SAP HANA with Web Intelligence?

In order to leverage the in-memory capabilities of SAP HANA, all the calculations and aggregations have to be done in SAP HANA for high performance analytical reporting. In this scenario, Web Intelligence should avoid pre-loading too much data into the local cache (microcube), as this could cause performance issues: too much data crossing the network, being loaded into the microcube and being rendered in Webi. Web Intelligence only needs to retrieve the results calculated by HANA that are needed for the report.
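The difference between aggregating locally and pushing the aggregation down to the database can be shown with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for HANA (an assumption made only so the example runs anywhere), and the `sales` table and its columns are hypothetical. What matters is the number of rows crossing the "network", not the engine.

```python
import sqlite3

# In-memory SQLite database standing in for HANA (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
rows = [(region, f"P{i}", 10.0)
        for region in ("EMEA", "APJ", "AMER")
        for i in range(1000)]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# Anti-pattern: fetch every detail row and aggregate on the client,
# which is what a report does when it loads raw data into the microcube.
detail = conn.execute("SELECT region, amount FROM sales").fetchall()
local_totals = {}
for region, amount in detail:
    local_totals[region] = local_totals.get(region, 0.0) + amount

# Pushdown: let the database aggregate and return only the results.
pushed = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"
).fetchall()

print(f"rows fetched without pushdown: {len(detail)}")
print(f"rows fetched with pushdown:    {len(pushed)}")
```

Both approaches produce the same totals per region, but the pushdown version transfers three rows instead of three thousand.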

What is the best BI tool to leverage HANA?

SAP HANA doesn't change the positioning of SAP BI4 tools. You still need to choose the right tool for the right job.

What is the best-performing way to access SAP HANA from Web Intelligence?

Web Intelligence can only access SAP HANA through a universe. The optimal way is to create HANA information views (Analytical Views or Calculation Views) and to map them directly in BI4's Information Design Tool.
For more information, please refer to this must-read document: “Creating a Universe on SAP HANA Best Practices” (http://scn.sap.com/docs/DOC-23256).

My Web Intelligence report is connected to SAP HANA and I don't see much performance difference with my previous database.

The value of SAP HANA is best seen in analytical workflows. Its performance is outstanding when processing calculations and aggregations. If you're trying to refresh a static Webi report with loads of data, you may notice only small or insignificant performance gains in the query execution phase, and the fetching phase will depend on other factors like the network transfer, the loading of data into the microcube or the rendering of this data in WebI. Simply replacing your old database with SAP HANA will not lead to sub-second performance. You still have to implement BI reporting best practices (e.g., query on drill, query stripping etc.) and Universe/HANA view design best practices in order to fully leverage the power of SAP HANA.

How to quickly analyze the performance of Web Intelligence on HANA?

The performance can be broken down into 3 different parts (HANA Database, Network, and Web Intelligence). The quick and dirty way to check the performance is to install HANA studio on the same box as the BI4.0 system and to compare the query performance using these two tools. Use advanced monitoring tools like Solution Manager / Wily Introscope for a deeper performance investigation.


  •   Time spent in HANA: Retrieve the SQL generated by Webi, paste it into HANA studio and run it. HANA studio shows the time taken to execute the request in HANA and the time taken to fetch the results into HANA studio. Check the execution phase while you execute the query in HANA studio.
  •   Time spent in the network: You need to change the “max displayed rows in result” setting in HANA Studio (maximum is 99999 rows). Check the fetching phase while you execute the query in HANA studio.
  •   Time spent in Webi: Check the data manager in WebIntelligence (it includes execution and fetching phase performance but only shows seconds).
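If you prefer to script this kind of check, the same execution-versus-fetch split can be sketched with any Python DB-API driver. The example below uses the built-in sqlite3 module as a stand-in for a HANA connection (an assumption for the sake of a runnable example), and `time_query` is a hypothetical helper, not part of any SAP tool. Be aware that drivers differ in how much work they do in `execute()` versus `fetchall()`, so treat the split as indicative only.

```python
import sqlite3
import time


def time_query(conn, sql):
    """Time the execute and fetch phases of a query separately.

    Works with any DB-API 2.0 connection; returns timings in seconds
    plus the number of rows fetched.
    """
    cur = conn.cursor()
    t0 = time.perf_counter()
    cur.execute(sql)          # execution phase (work done in the database)
    t1 = time.perf_counter()
    rows = cur.fetchall()     # fetching phase (rows transferred to the client)
    t2 = time.perf_counter()
    return {"execute_s": t1 - t0, "fetch_s": t2 - t1, "rows": len(rows)}


# Demo against an in-memory SQLite table (stand-in for a HANA view).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO facts VALUES (?, ?)",
                 [(i, 1.0) for i in range(500)])

result = time_query(conn, "SELECT id, amount FROM facts")
print(result)
```

Paste the SQL generated by Webi in place of the demo query to compare the numbers with what you see in the Webi data manager.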
   

What is Query Stripping?

Query Stripping is an automated way to remove ("strip") unnecessary objects from a Web Intelligence query, i.e. objects that the report doesn't need in order to display the correct results. This feature avoids sending overly large queries to the database and reduces the amount of data transferred across the network.
As of today, this feature is only available for OLAP datasources. Query Stripping is planned to be available for HANA in a future BI4 release. However, a report designer can apply a similar technique by simply removing objects from the Query Panel that are not used in the report, thus simulating the Query Stripping behavior.
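The idea behind Query Stripping can be sketched in a few lines. This is not how Web Intelligence implements it internally; the helper functions, the object names and the `SALES_VIEW` name are all hypothetical, and the sketch only shows how dropping unused objects shrinks the generated SQL.

```python
def strip_query(query_objects, report_objects):
    """Keep only the query objects that the report actually uses."""
    used = set(report_objects)
    return [obj for obj in query_objects if obj in used]


def build_select(view, dimensions, measures):
    """Build a simple aggregate SELECT for the given dimensions and measures."""
    cols = list(dimensions) + [f"SUM({m}) AS {m}" for m in measures]
    sql = f"SELECT {', '.join(cols)} FROM {view}"
    if dimensions and measures:
        sql += f" GROUP BY {', '.join(dimensions)}"
    return sql


# Objects placed in the Query Panel (hypothetical names).
query_dims = ["Region", "Product", "Year"]
query_measures = ["Revenue", "Cost"]

# Objects the report really displays.
report_uses = ["Region", "Revenue"]

full_sql = build_select("SALES_VIEW", query_dims, query_measures)
stripped_sql = build_select(
    "SALES_VIEW",
    strip_query(query_dims, report_uses),
    strip_query(query_measures, report_uses),
)

print(full_sql)
print(stripped_sql)
```

The stripped query groups by one dimension instead of three, so the database returns far fewer rows for the same report.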
 

What is Query on Drill and how to use it?

Query on Drill sends a new query to the database each time a user performs a drill action. As SAP HANA executes queries very fast, the user has instant access to "live" data each time he drills down or up. This feature helps reduce the amount of data going through the network by pushing the calculation down to SAP HANA and retrieving only the results into the microcube.

In contrast, with the default "Scope of Analysis" drill function, the drillable data needs to be pre-fetched into the microcube. If a dimension (or characteristic) is not in the scope of analysis (i.e. not in the microcube), WebIntelligence sends a new query to the database. This new “out of scope” query can lead to some performance degradation in traditional databases if the data volume to be aggregated is huge. This is why, when using traditional databases, DBAs need to create aggregate tables in order to provide acceptable query performance for their end users.

The main drawbacks of pre-aggregating data are the maintenance costs, the lack of freshness of the data (e.g., some customers perform ETL to update their aggregates once a week or even less often) and the lack of flexibility (the queries need to be anticipated, and if a user requests data that is not pre-aggregated, this can lead to long waiting times).

With SAP HANA, Query on Drill retrieves each drill step with sub-second performance, regardless of the data volume.
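To illustrate the Query on Drill workflow, here is a minimal sketch in which every drill step issues a fresh aggregate query restricted to the member the user drilled into. sqlite3 again stands in for HANA, and the `sales` table, the year/quarter/month hierarchy and the `drill_query` helper are assumptions made up for the example, not WebIntelligence internals.

```python
import sqlite3

# Drill path from summary to detail (hypothetical time hierarchy).
HIERARCHY = ["year", "quarter", "month"]


def drill_query(level, filters):
    """Build the aggregate query for one drill step.

    level   -- index into HIERARCHY of the dimension to display
    filters -- dict of {column: value} chosen in earlier drill steps
    Returns (sql, params) using placeholders for the filter values.
    """
    dim = HIERARCHY[level]
    sql = f"SELECT {dim}, SUM(amount) FROM sales"
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    sql += f" GROUP BY {dim} ORDER BY {dim}"
    return sql, list(filters.values())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (year INTEGER, quarter TEXT, month INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [(2012, f"Q{(m - 1) // 3 + 1}", m, 1.0) for m in range(1, 13) for _ in range(100)],
)

# Step 1: summary by year. Step 2: drill into 2012. Step 3: drill into Q1.
top = conn.execute(*drill_query(0, {})).fetchall()
by_quarter = conn.execute(*drill_query(1, {"year": 2012})).fetchall()
by_month = conn.execute(*drill_query(2, {"year": 2012, "quarter": "Q1"})).fetchall()

print(top)
print(by_quarter)
print(by_month)
```

Each step returns only a handful of aggregated rows, which is what keeps the microcube small when the database is fast enough to answer every drill on demand.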


Bad Design - What should you avoid doing when using Web Intelligence with any database, including SAP HANA?


Please note that creating a detailed report containing fine-grained data, like a 500-page invoice or a big detailed operational report, is not a mind-blowing use case for HANA. You won’t leverage much of the analytical capabilities of HANA.
As with any other database, wide open queries (e.g., select * from Analytic View) are also to be avoided, and safety belts should be implemented at universe and HANA level to avoid runaway queries.

Always remember, when designing BI reports with any DB, fast or slow, that bad design leads to bad performance. Make sure you design your HANA model for optimal performance. Make your BI4 reports answer relevant business questions and avoid querying unnecessary data that puts too much load on the database and network. Do not use BI tools as a data extraction tool!