Apache Hive
Description
However, a challenge remains: how do you move an existing data infrastructure to Hadoop when that infrastructure is based on traditional relational databases and the Structured Query Language (SQL)? What about the large base of SQL users, from expert database designers and administrators to casual users who use SQL to extract information from their data warehouses? This is where Hive comes in. Hive provides an SQL dialect, called Hive Query Language (abbreviated HiveQL or just HQL), for querying data stored in a Hadoop cluster. SQL knowledge is widespread for a reason: it’s an effective, reasonably intuitive model for organizing and using data. Mapping these familiar data operations to the low-level MapReduce Java API can be daunting, even for experienced Java developers. Hive does this dirty work for you, so you can focus on the query itself. Hive translates most queries to MapReduce jobs, thereby exploiting the scalability of Hadoop while presenting a familiar SQL abstraction. If you don’t believe us, see “Java Versus Hive: The Word Count Algorithm” on page 10 later in this chapter.
Details
Author(s): | Edward Capriolo, Dean Wampler, and Jason Rutherglen |
Year: | 2012 |
Pages: | 350 |
Language: | English |
Size: | 8.30 MB |
Category: | Databases |
Tags: | hadoop, java, SQL, base de datos, Database, clustering |