Solr for Developers

Overview

This course introduces students to the Solr search platform. Through a combination of lecture, discussion, and labs, students will gain hands-on experience configuring effective search and indexing.

What You Will Learn

  • Solr installation and configuration
  • Solr architecture and design
  • Faceting, indexing, and search
  • Advanced topics including spell checking, suggestions, Multicore, and SolrCloud

Duration

Two days, with an optional third day for developers

Audience

Developers, business users, administrators

Prerequisites

All attendees should be experienced technical staff with a background in web application operations and, preferably, development.

Lab Environment

Amazon EC2 servers will be provided to students for installation, administration, and lab work. Students need an SSH client and a browser to access the cluster.

Zero Install: Students do not need to install Solr on their own machines, although it is possible to do so.

Detailed Outline

Overall Goal

Provide experienced web developers and technical staff with a comprehensive introduction to the Solr search platform. Give software developers in-depth skills for building search solutions.

Fundamentals

  • Solr Overview
  • Installing and running Solr
  • Adding content to Solr
  • Reading a Solr XML response
  • Changing parameters in the URL
  • Using the browse interface
  • Labs: install Solr, run queries

Searching

  • Sorting results
  • Query parsers
  • More queries
  • Hardwiring request parameters
  • Adding fields to the default search
  • Faceting
  • Result grouping
  • Labs: advanced queries, experiment with faceted search

Indexing

  • Adding your own content to Solr
  • Deleting data from Solr
  • Building a bookstore search
  • Adding book data
  • Exploring the book data
  • Dedupe update processor
  • Labs: indexing various document collections

Schema Updating

  • Adding fields to the schema
  • Analyzing text
  • Labs: customize Solr schema

Relevance

  • Field weighting
  • Phrase queries
  • Function queries
  • Fuzzier search
  • Sounds-like
  • Labs: implementing queries for relevance

Extended Features

  • More-like-this
  • Geospatial
  • Spell checking
  • Suggestions
  • Highlighting
  • Pseudo-fields
  • Pseudo-joins
  • Multilanguage
  • Labs: implementing spell checking and suggestions

Multicore

  • Adding more kinds of data
  • Labs: creating and administering cores

SolrCloud

  • Introduction
  • How SolrCloud works
  • Commit strategies
  • ZooKeeper
  • Managing Solr config files
  • Labs: administer SolrCloud

Developer sessions:

Developing with Solr API

  • Talking to Solr through REST
  • Configuration
  • Indexing and searching
  • Solr and Spring
  • Labs: writing code to read and write a Solr index, Spring with Solr exercise

Developing with Lucene API

  • Building a Lucene index
  • Searching, viewing, debugging
  • Extracting text with Tika
  • Scaling Lucene indices on clusters
  • Lucene performance tuning
  • Labs: coding with Lucene

Conclusion

  • Other approaches to search
    • Elasticsearch
    • DataStax Enterprise: Solr+Cassandra
    • Cloudera Solr integration
    • Blur
  • Future directions