Solr for Developers
Overview
This course introduces students to the Solr platform. Through a combination of lecture, discussion, and labs, students will gain hands-on experience configuring effective search and indexing.
What You Will Learn
- Solr installation and configuration
- Solr architecture and design
- Faceting, indexing, and search
- Advanced topics including spell checking, suggestions, Multicore, and SolrCloud
Duration
Two days, with an optional third day for developers
Audience
Developers, business users, administrators
Prerequisites
All attendees should be experienced technical staff with a background in web application operations and, preferably, development.
Lab environment
Amazon EC2 servers will be provided to students for installation, administration, and lab work. Students will need an SSH client and a browser to access the cluster.
Zero Install: There is no need to install Solr software on students’ machines, although it is possible.
Detailed Outline
Overall Goal
Provide experienced web developers and technical staff with a comprehensive introduction to the Solr search platform. Give software developers in-depth skills in building search solutions.
Fundamentals
- Solr Overview
- Installing and running Solr
- Adding content to Solr
- Reading a Solr XML response
- Changing parameters in the URL
- Using the browse interface
- Labs: install Solr, run queries
Searching
- Sorting results
- Query parsers
- More queries
- Hardwiring request parameters
- Adding fields to the default search
- Faceting
- Result grouping
- Labs: advanced queries, experiment with faceted search
Indexing
- Adding your own content to Solr
- Deleting data from Solr
- Building a bookstore search
- Adding book data
- Exploring the book data
- Dedupe update processor
- Labs: indexing various document collections
Schema Updating
- Adding fields to the schema
- Analyzing text
- Labs: customize Solr schema
Relevance
- Field weighting
- Phrase queries
- Function queries
- Fuzzier search
- Sounds-like
- Labs: implementing queries for relevance
Extended features
- More-like-this
- Geospatial
- Spell checking
- Suggestions
- Highlighting
- Pseudo-fields
- Pseudo-joins
- Multilanguage
- Labs: implementing spell checking and suggestions
Multicore
- Adding more kinds of data
- Labs: creating and administering cores
SolrCloud
- Introduction
- How SolrCloud works
- Commit strategies
- ZooKeeper
- Managing Solr config files
- Labs: administer SolrCloud
Developer sessions:
Developing with Solr API
- Talking to Solr through REST
- Configuration
- Indexing and searching
- Solr and Spring
- Labs: code to read and write Solr index, exercise in Spring with Solr
Developing with Lucene API
- Building a Lucene index
- Searching, viewing, debugging
- Extracting text with Tika
- Scaling Lucene indices on clusters
- Lucene performance tuning
- Labs: coding with Lucene
Conclusion
- Other approaches to search
- ElasticSearch
- DataStax Enterprise: Solr+Cassandra
- Cloudera Solr integration
- Blur
- Future directions