Posts

Showing posts from 2014

SQL Indexes

Clustered Indexes: Use a B-tree structure. A table can have only one clustered index. A clustered index is like a telephone directory: you look up a person's name alphabetically and find the phone number right there.
CLUSTERED INDEX SCAN: Occurs when a table with a clustered index is accessed and either the query does not use a non-clustered index on the table or the table has no non-clustered index. It performs poorly when a large amount of data, with most columns and rows, is retrieved.
CLUSTERED INDEX SEEK: Occurs when a table with a clustered index is accessed and the query locates specific rows in the B+ tree. It is generally efficient, but it is still worth evaluating whether a non-clustered index would help.
Non-Clustered Indexes: Use a B-tree structure. SQL Server 2005 allows up to 249 non-clustered indexes per table; SQL Server 2008 allows 999. A non-clustered index is like the index of a book: it gives you the page number of the item you are looking for, and you then turn to that page to read it. ...
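The directory and book-index analogies can be sketched in a few lines of Python. This is only an illustration, not how SQL Server stores data: the `directory` rows, the function names, and the use of a sorted list plus a dict in place of real B-trees are all assumptions made for the example.

```python
from bisect import bisect_left

# Hypothetical table: rows are stored sorted by the clustered key (name),
# the way a telephone directory is ordered.
directory = [
    ("Adams", "555-0101"),
    ("Baker", "555-0102"),
    ("Clark", "555-0103"),
    ("Davis", "555-0104"),
    ("Evans", "555-0105"),
]
names = [name for name, _ in directory]


def clustered_seek(name):
    """Seek: binary search on the sorted key, then read the row in place."""
    pos = bisect_left(names, name)
    if pos < len(names) and names[pos] == name:
        return directory[pos][1]
    return None


def clustered_scan(phone):
    """Scan: nothing is ordered by phone, so every row is examined in turn."""
    for name, number in directory:
        if number == phone:
            return name
    return None


# Stand-in for a non-clustered index on phone. Like a book index, it holds
# only a pointer (here the row position) that must then be followed.
# A real non-clustered index is a B-tree; a dict is used only for brevity.
phone_index = {number: pos for pos, (_, number) in enumerate(directory)}


def nonclustered_lookup(phone):
    """Look up the pointer in the index, then fetch the row it points to."""
    pos = phone_index.get(phone)
    return None if pos is None else directory[pos][0]
```

Calling `clustered_seek("Clark")` finds the number with a handful of comparisons, while `clustered_scan("555-0104")` has to walk the rows until it reaches a match; `nonclustered_lookup("555-0104")` avoids the walk at the cost of maintaining a second structure.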

Big Data and Hadoop – Part 2

Apache Hadoop is a fast-growing big data framework. Advantages:
Problems with Traditional Large-Scale Systems: processor-bound, with lots of complex processing on ever bigger computers (this changed with distributed computers); programming complexity; keeping data and processes in sync; finite bandwidth; partial failures.
Distributed Systems, The Data Bottleneck: traditionally, data is stored in a central location; data is copied to processors at runtime; this is fine for a limited amount of data.
Types of Analysis with Hadoop: text mining, index building, graph creation and analysis, pattern recognition, collaborative filtering, prediction models, sentiment analysis, risk assessment.
Nature of Analysis: batch processing, parallel execution, distributed data.
Hadoop Users: BlackBerry: growth of data; analysis and ad hoc queries took too much time. CBS Interactive: cross-site and historical analysis, identifying user patterns. Nokia: 3G Digital Modeling, Us...
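The "batch processing, parallel execution, distributed data" pattern can be imitated on one machine. The sketch below does not use Hadoop or its API; the `partitions` data, the function names, and the choice of a thread pool are assumptions made purely to illustrate processing each partition where it lives and then combining the partial results, here for a simple text-mining word count.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical input, already split into partitions as if each one
# were stored on a different node of a cluster.
partitions = [
    ["big data moves fast", "data is big"],
    ["hadoop processes data in batches"],
    ["batches run in parallel"],
]


def count_words(lines):
    """Process one partition on its own and return its partial counts."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def run_batch(parts):
    """Run every partition in parallel, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(count_words, parts))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


word_counts = run_batch(partitions)
```

Only the small per-partition counts travel to the merge step, not the raw lines, which is the point of moving the computation to the data rather than copying the data to a central processor.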

Big Data and Apache Hadoop – Part 1

Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn't fit the strictures of your database architectures. Big data is a collection of data from traditional and digital sources inside and outside your company that represents a source for ongoing discovery and analysis. Unstructured data comes from information that is not organized or easily interpreted by traditional databases or data models, and typically, it's text-heavy. Metadata, Twitter tweets, and other social media posts are good examples of unstructured data. Multi-structured data refers to a variety of data formats and types and can be derived from interactions between people and machines, such as web applications or social networks. A great example is web log data, which includes a combination of text and visual images along with structured data like form or transactional information. As digital disruption transforms commun...

SQL Server Analysis Services (SSAS)

Business Intelligence Queries: OLAP (Online Analytical Processing) queries.
Architecture Options: Data Source -> Integration Services -> Data Mart/Data Warehouse -> Integration Services -> Analysis Services -> Cube Browser/Reporting Services/SharePoint Services.
Database Development: Design Dimensional Model -> Develop dimensions -> <- Develop cubes -> <- Add Calculations -> <- Deploy to server.
Business Intelligence Development Studio Project Types: Analysis Services Project (develop from scratch); Import Analysis Services Database.
Project Items: Data Source (connection and impersonation information); Data Source View; Cube (at least one per project); Dimension (at least one per project); Role (at least one per project).
Dimensional Model Development: Data Source: the impersonation account must have READ permission on the source.
Top Down vs. Bottom Up Design: Top Down: design the cube and dimensions free...
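To make the cube and dimension vocabulary concrete, here is a minimal Python sketch of rolling a measure up along chosen dimensions. It is not SSAS or MDX: the `facts` rows, the dimension names, and the `roll_up` helper are all hypothetical and only show the kind of aggregation an OLAP query asks a cube for.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales_amount).
# The first three fields are dimension values; the last is the measure.
facts = [
    (2013, "East", "Bikes", 100),
    (2013, "West", "Bikes", 150),
    (2014, "East", "Bikes", 120),
    (2014, "East", "Helmets", 30),
    (2014, "West", "Helmets", 45),
]
DIMENSIONS = ("year", "region", "product")


def roll_up(rows, by):
    """Sum the measure for each combination of the requested dimensions."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]
    return dict(totals)


sales_by_year = roll_up(facts, by=("year",))
sales_by_year_region = roll_up(facts, by=("year", "region"))
```

A real cube precomputes and stores many such aggregations so that browsing by any mix of dimensions stays fast; this sketch recomputes them on every call.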