Exploratory Data Mining and Data Cleaning: msg#00117

db.dbworld

Subject: Exploratory Data Mining and Data Cleaning

Data analysts at information-intensive businesses are frequently asked
to analyze new data sets that are often dirty, composed of numerous
tables possessing unknown properties. Prior to analysis, this data must
be cleaned and explored, often a long and arduous task. Ensuring data
quality is a notoriously messy problem that can only be addressed by
drawing on methods from many disciplines, including statistics,
exploratory data mining, database management, and metadata coding.

Where other books on data mining and analysis focus primarily on the
last stage of the analysis procedure, Exploratory Data Mining and Data
Cleaning uses a uniquely integrated approach to data exploration and
data cleaning to develop a suitable modeling strategy that will help
analysts to more effectively determine and implement the final
technique.

The authors, both seasoned data analysts at a major corporation, draw
on their own professional experience to:

* Present a brief overview of the main analytical techniques used
in data mining practices, such as univariate and multivariate
summaries of attributes and their interactions including Q-Q
plots, fractal dimension and histograms, nonparametric approaches
incorporating data depth, and more

* Provide numerous references to the related literature on
clustering, classification, regression, and more

* Focus on developing an evolving modeling strategy through an
iterative data exploration loop and incorporation of domain
knowledge

* Address methods of detecting, quantifying (metrics), and
correcting data quality issues that significantly impact findings
and decisions, using commercially available tools as well as new
algorithmic approaches

* Use case studies to illustrate applications in real-life
scenarios

* Highlight new approaches and methodologies, such as the
DataSphere space partitioning and summary-based analysis techniques

A groundbreaking addition to the existing literature, Exploratory Data
Mining and Data Cleaning serves as an important reference for data
analysts who need to analyze large amounts of unfamiliar data,
operations managers, and students in undergraduate or graduate-level
courses dealing with data analysis and data mining.



