Elasticsearch vs. Solr: Have Both With Spring Data and Platform.sh
The term "big data" is often explained through the three Vs: volume, variety, and velocity. Volume refers to the size of the data, variety to its diverse types, and velocity to the speed at which it is processed. To handle persistent big data, there are NoSQL databases that, for many workloads, write and read data faster than a SQL database. But with such diverse data spread across a vast volume, a search engine is required to find information without using significant computing power and without taking too much time. In this post, we’ll talk about two of the most popular search engines, Elasticsearch and Apache Solr, and how Platform.sh supports both.
Based on Apache Lucene, both search engines are open source and written in Java. And they both have beautiful, rich documentation:
Elasticsearch Reference Guide
Solr Reference Guide
To give you an idea of how relevant and useful full-text search (FTS) is, this post creates two straightforward music applications that both use Spring MVC and CRUD functionality (Create, Read, Update, and Delete in a database). The only difference is the database: one uses Elasticsearch, the other uses Apache Solr. Songs have lyrics that contain long sequences of words, and full-text engines let you define indexes over them, so searching is faster and more efficient than using LIKE with wildcards in a SQL database.
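To make the comparison with LIKE concrete, here is a minimal plain-Java sketch of the idea behind a full-text index, an inverted index that maps each word to the documents containing it. The class `InvertedIndexSketch` and its methods are hypothetical teaching names, not part of Elasticsearch, Solr, or the sample applications, and real engines add analysis, scoring, and on-disk structures on top of this.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical teaching class; not part of Elasticsearch, Solr, or the sample apps.
class InvertedIndexSketch {

    // token -> IDs of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    // Index time: split the text into lowercase tokens and record the document ID under each one.
    int add(String text) {
        int id = documents.size();
        documents.add(text);
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    // Query time: a single map lookup, independent of how many documents are stored.
    Set<Integer> search(String word) {
        return postings.getOrDefault(word.toLowerCase(), new TreeSet<>());
    }

    // Rough equivalent of LIKE '%word%': every document is scanned on every query.
    List<Integer> scan(String word) {
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < documents.size(); id++) {
            if (documents.get(id).toLowerCase().contains(word.toLowerCase())) {
                hits.add(id);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add("All you need is love");
        index.add("Love me do");
        index.add("Yellow submarine");
        System.out.println(index.search("love"));
        System.out.println(index.scan("love"));
    }
}
```

Both calls find the same two songs here, but `scan` touches every stored text while `search` only reads one map entry. The word-based index also avoids accidental substring hits: scanning for "low" would match "Yellow", whereas the index has no such token.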
Elasticsearch
As noted above, Elasticsearch is based on the Lucene library. It provides a distributed, multitenant-capable, full-text search engine with an HTTP web interface and schema-free JSON documents. And it runs on Java.
In a Maven application, we declare the dependencies the application needs in pom.xml. For the Elasticsearch application, these include the Elasticsearch client library and the Platform.sh configuration reader.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>sh.platform.start</groupId>
<artifactId>spring-mvc-maven-elasticsearch</artifactId>
<version>0.0.1</version>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.1.5.RELEASE</version>
</parent>
<properties>
<java.version>1.8</java.version>
<elasticsearch.version>6.5.0</elasticsearch.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-thymeleaf</artifactId>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-devtools</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>sh.platform</groupId>
<artifactId>config</artifactId>
<version>2.2.0</version>
</dependency>
</dependencies>
<build>
<finalName>spring-mvc-maven-elasticsearch</finalName>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
The first step in building the application is the Music entity: an object defined by its identity and a thread of continuity, rather than by its attributes.
public class Music {
private String id;
private String name;
private String singer;
private int year;
private String lyrics;
//getter and setter
}
Spring Data Elasticsearch applies core Spring concepts to the development of solutions that use the Elasticsearch search engine. However, it uses the TransportClient, which is deprecated in favor of the Java High-Level REST Client and will be removed in Elasticsearch 8.0. Therefore, this application uses the Java High-Level REST Client from Elasticsearch directly. Using the @Bean annotation, we provide a RestHighLevelClient instance to be used in the service layer. The configuration class below builds the RestHighLevelClient from the credentials provided by Platform.sh and creates the index if it does not exist yet.
import org.elasticsearch.action.admin.indices.create.CreateIndexRequest;
import org.elasticsearch.action.admin.indices.create.CreateIndexResponse;
import org.elasticsearch.action.admin.indices.get.GetIndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import sh.platform.config.Config;
import sh.platform.config.Elasticsearch;
import java.io.IOException;
@Configuration
public class ElasticsearchConfig {
static final String INDEX = "musics";
static final String TYPE = "music";
@Bean
public RestHighLevelClient elasticsearchTemplate() throws IOException {
Config config = new Config();
final Elasticsearch credential = config.getCredential("elasticsearch", Elasticsearch::new);
final RestHighLevelClient client = credential.get();
CreateIndexRequest request = new CreateIndexRequest(INDEX);
GetIndexRequest exist = new GetIndexRequest();
exist.indices(INDEX);
if (!client.indices().exists(exist, RequestOptions.DEFAULT)) {
client.indices().create(request, RequestOptions.DEFAULT);
}
return client;
}
}
The service layer sits on top of the entity class and handles the business requirements: CRUD operations against the database and a search for terms in the name, lyrics, and singer fields of a song.
import com.fasterxml.jackson.databind.ObjectMapper;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Repository;
import org.thymeleaf.util.StringUtils;
import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.stream.Collectors;
import static java.util.stream.StreamSupport.stream;
import static org.elasticsearch.index.query.QueryBuilders.boolQuery;
import static org.elasticsearch.index.query.QueryBuilders.termQuery;
import static sh.platform.template.ElasticsearchConfig.INDEX;
import static sh.platform.template.ElasticsearchConfig.TYPE;
@Repository
public class MusicService {
@Autowired
private ObjectMapper objectMapper;
@Autowired
private RestHighLevelClient client;
public List<Music> findAll(String search) throws IOException {
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
if (!StringUtils.isEmpty(search)) {
final BoolQueryBuilder queryBuilder = boolQuery().should(termQuery("lyrics", search))
.should(termQuery("name", search))
.should(termQuery("singer", search));
sourceBuilder.query(queryBuilder);
}
SearchRequest searchRequest = new SearchRequest();
searchRequest.indices(INDEX);
searchRequest.source(sourceBuilder);
SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
return stream(response.getHits().spliterator(), false)
.map(SearchHit::getSourceAsMap)
.map(s -> objectMapper.convertValue(s, Music.class))
.collect(Collectors.toList());
}
public List<Music> findAll() throws IOException {
return findAll(null);
}
public Optional<Music> findById(String id) throws IOException {
GetRequest request = new GetRequest(INDEX, TYPE, id);
final GetResponse response = client.get(request, RequestOptions.DEFAULT);
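// getSource() returns null when the document does not exist, so guard against both cases.
final Map<String, Object> source = response.getSource();
if (source == null || source.isEmpty()) {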
return Optional.empty();
} else {
return Optional.ofNullable(objectMapper.convertValue(source, Music.class));
}
}
public void save(Music music) throws IOException {
if (StringUtils.isEmpty(music.getId())) {
music.setId(UUID.randomUUID().toString());
}
Map<String, Object> jsonMap = objectMapper.convertValue(music, Map.class);
IndexRequest indexRequest = new IndexRequest(INDEX, TYPE)
.id(music.getId()).source(jsonMap);
client.index(indexRequest, RequestOptions.DEFAULT);
}
public void delete(Music music) throws IOException {
client.delete(new DeleteRequest(INDEX, TYPE, music.getId()), RequestOptions.DEFAULT);
}
}
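One caveat about the `termQuery` calls in the service: a term query compares the search string with the indexed tokens as-is, while text fields are, by default, analyzed at index time (split into words and lowercased). The plain-Java sketch below only imitates that behavior to show the consequence; `TermVersusMatchSketch`, `analyze`, `termMatches`, and `matchMatches` are hypothetical names, and the real analysis chain in Elasticsearch does more than this.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for an analyzed text field; it only imitates lowercasing and word splitting.
class TermVersusMatchSketch {

    // Imitates index-time analysis: lowercase, then split on non-word characters.
    static Set<String> analyze(String text) {
        Set<String> tokens = new HashSet<>(Arrays.asList(text.toLowerCase().split("\\W+")));
        tokens.remove("");
        return tokens;
    }

    // Term-style lookup: the search string is compared with the stored tokens as-is.
    static boolean termMatches(String fieldValue, String search) {
        return analyze(fieldValue).contains(search);
    }

    // Match-style lookup: the search string is analyzed first; any shared token counts as a hit.
    static boolean matchMatches(String fieldValue, String search) {
        Set<String> wanted = analyze(search);
        wanted.retainAll(analyze(fieldValue));
        return !wanted.isEmpty();
    }

    public static void main(String[] args) {
        String lyrics = "All You Need Is Love";
        System.out.println(termMatches(lyrics, "love"));
        System.out.println(termMatches(lyrics, "Love"));
        System.out.println(matchMatches(lyrics, "Love"));
    }
}
```

In this imitation, a term-style lookup for "Love" misses a song whose lyrics contain "Love", because only the lowercase token was stored. If case-insensitive search matters for the music store, `QueryBuilders.matchQuery` is the usual alternative to `termQuery`, since it analyzes the search string before comparing.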
In Spring’s approach to building websites, HTTP requests are handled by a controller. You can quickly identify a controller by the @Controller annotation. In the following example, MusicController handles the HTTP requests.
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.validation.BindingResult;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import javax.validation.Valid;
import java.io.IOException;
import static java.util.stream.Collectors.toList;
@Controller
public class MusicController {
@Autowired
private MusicService musicService;
@GetMapping("/")
public String start(@RequestParam(name = "search", required = false) String search, Model model) throws IOException {
model.addAttribute("musics", musicService.findAll(search));
model.addAttribute("search", search);
return "index";
}
@GetMapping("/add")
public String addUser(Model model) {
model.addAttribute("music", new Music());
return "add-music";
}
@PostMapping("/add")
public String addUser(@Valid Music music, BindingResult result, Model model) throws IOException {
if (result.hasErrors()) {
return "add-music";
}
musicService.save(music);
model.addAttribute("musics", musicService.findAll());
return "index";
}
@GetMapping("/edit/{id}")
public String showUpdateForm(@PathVariable("id") String id, Model model) throws IOException {
Music music = musicService.findById(id).orElseThrow(() -> new IllegalArgumentException("Invalid music Id:" + id));
model.addAttribute("music", music);
return "add-music";
}
@PostMapping("/update/{id}")
public String updateUser(@PathVariable("id") String id, @Valid Music music, BindingResult result, Model model) throws IOException {
if (result.hasErrors()) {
music.setId(id);
return "add-music";
}
musicService.save(music);
model.addAttribute("musics", musicService.findAll());
return "index";
}
@GetMapping("/delete/{id}")
public String deleteUser(@PathVariable("id") String id, Model model) throws IOException {
Music music = musicService.findById(id).orElseThrow(() -> new IllegalArgumentException("Invalid music Id:" + id));
musicService.delete(music);
model.addAttribute("musics", musicService.findAll()
.stream()
.filter(m -> !m.getId().equals(id))
.collect(toList()));
return "index";
}
}
Configuring Elasticsearch on Platform.sh can be done in one easy step. Just append this value to the service file:
elasticsearch:
    type: elasticsearch:6.5
    disk: 256
    size: S
To get more details about the configuration files at Platform.sh, please check out this post that explains how your Java application can be created or moved to Platform.sh.
Apache Solr
Solr is an open-source enterprise search platform written in Java, and it is part of the Apache Lucene project. Its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features, and rich document handling.
As in the Elasticsearch application, the Apache Solr sample code needs to define a `pom.xml` with its dependencies.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>sh.platform.start</groupId>
<artifactId>spring-mvc-maven-solr</artifactId>
<version>0.0.1</version>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.1.5.RELEASE</version>
</parent>
<properties>
<java.version>1.8</java.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-thymeleaf</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.data</groupId>
<artifactId>spring-data-solr</artifactId>
<version>4.0.8.RELEASE</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-devtools</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>sh.platform</groupId>
<artifactId>config</artifactId>
<version>2.2.0</version>
</dependency>
</dependencies>
<build>
<finalName>spring-mvc-maven-solr</finalName>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
Thanks to Spring Data for Apache Solr, we can easily configure and access the Apache Solr search server from Spring applications. The Music entity class has annotations that map its fields and drive the translation between Java objects and Apache Solr documents.
import org.springframework.data.annotation.Id;
import org.springframework.data.solr.core.mapping.Indexed;
import org.springframework.data.solr.core.mapping.SolrDocument;
@SolrDocument(collection = "collection1")
public class Music {
@Id
@Indexed(name = "id", type = "string")
private String id;
@Indexed(name = "name", type = "string")
private String name;
@Indexed(name = "singer", type = "string")
private String singer;
@Indexed(name = "year", type = "string")
private int year;
@Indexed(name = "lyrics", type = "string")
private String lyrics;
//getter and setter
}
To use Spring Data features, an instance of SolrTemplate is required. That’s why we have the SolrConfig class, which registers the SolrTemplate as a Spring bean built on the Apache Solr client instance provided by Platform.sh.
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.solr.core.SolrTemplate;
import sh.platform.config.Config;
import sh.platform.config.Solr;
@Configuration
public class SolrConfig {
@Bean
public HttpSolrClient httpSolrClient() {
Config config = new Config();
final Solr credential = config.getCredential("solr", Solr::new);
final HttpSolrClient httpSolrClient = credential.get();
String url = httpSolrClient.getBaseURL();
httpSolrClient.setBaseURL(url.substring(0, url.lastIndexOf('/')));
return httpSolrClient;
}
@Bean
public SolrTemplate solrTemplate(HttpSolrClient client) {
return new SolrTemplate(client);
}
}
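The `substring` call in SolrConfig removes the last path segment from the client's base URL. Presumably the URL handed out by Platform.sh already ends with the core name, while Spring Data Solr adds the collection from the `@SolrDocument` annotation itself; that reasoning is an assumption, not something the code states. The sketch below isolates the string manipulation, with a hypothetical helper name and a made-up URL.

```java
// Hypothetical helper that mirrors the substring call in SolrConfig.
class SolrBaseUrlSketch {

    // Drops the last path segment (for example, the core name) from a URL.
    static String stripLastSegment(String url) {
        return url.substring(0, url.lastIndexOf('/'));
    }

    public static void main(String[] args) {
        // The host, port, and path are made up for illustration.
        System.out.println(stripLastSegment("http://solr.internal:8080/solr/collection1"));
    }
}
```

With the example input, the result is the URL without the trailing `/collection1` segment.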
One of the great benefits of Spring Data is the repository interface, a central interface for communicating with any database. Declaring a repository interface also gives the developer the query methods feature of Spring Data directly, without writing an implementation.
import org.springframework.data.solr.repository.Query;
import org.springframework.data.solr.repository.SolrCrudRepository;
import java.util.List;
public interface MusicRepository extends SolrCrudRepository<Music, String> {
@Query("lyrics:*?0* OR name:*?0* OR singer:*?0*")
List<Music> search(String searchTerm);
}
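The `?0` in the `@Query` string is a positional placeholder for the first method argument. As a rough illustration of what the search query looks like once the argument is filled in, the sketch below performs a naive substitution in plain Java. The `expand` helper is hypothetical; Spring Data Solr performs the real substitution internally, and its handling may differ in details such as type conversion or escaping.

```java
// Hypothetical helper; Spring Data Solr performs the real substitution internally.
class QueryPlaceholderSketch {

    // Replaces ?0, ?1, ... with the argument at that position.
    static String expand(String template, String... args) {
        String query = template;
        // Go from the highest index down so that ?1 never clobbers the start of ?10.
        for (int i = args.length - 1; i >= 0; i--) {
            query = query.replace("?" + i, args[i]);
        }
        return query;
    }

    public static void main(String[] args) {
        String template = "lyrics:*?0* OR name:*?0* OR singer:*?0*";
        System.out.println(expand(template, "love"));
    }
}
```

Because the term is wrapped in `*` on both sides, the query behaves like a substring match across the three fields.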
The service layer then delegates all of its work to this repository.
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Repository;
import org.springframework.transaction.annotation.Transactional;
import org.thymeleaf.util.StringUtils;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.UUID;
import static java.util.stream.Collectors.toList;
import static java.util.stream.StreamSupport.stream;
@Repository
public class MusicService {
@Autowired
private MusicRepository repository;
public List<Music> findAll(String search) {
if (repository.count() == 0) {
return Collections.emptyList();
}
if (StringUtils.isEmpty(search)) {
return stream(repository.findAll().spliterator(), false)
.collect(toList());
}
return repository.search(search);
}
public List<Music> findAll() {
return findAll(null);
}
public Optional<Music> findById(String id) {
return repository.findById(id);
}
@Transactional
public void save(Music music) {
if (StringUtils.isEmpty(music.getId())) {
music.setId(UUID.randomUUID().toString());
}
repository.save(music);
}
@Transactional
public void delete(Music music) {
repository.delete(music);
}
}
Creating an Apache Solr instance provided by Platform.sh can be done in one easy step. Just append this value to the service file:
solr:
    type: solr:7.7
    disk: 1024
    size: S
    configuration:
        cores:
            collection1:
                conf_dir: !archive "core1-conf"
        endpoints:
            solr:
                core: collection1
The Controller for both applications is the same, so we don’t need to duplicate the source code.
Also, both use the same front-end files, such as HTML, CSS, and JavaScript, thanks to the Thymeleaf template engine. It offers a set of Spring integrations that enable you to use it as a full-featured substitute for JSP in Spring MVC applications. The two templates returned by the controller, index and add-music, follow.
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head>
<title>Music Store</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="/css/bootstrap.min.css">
</head>
<body>
<div class="container">
<h1>Music Store</h1>
<form class="form-search" method="get" action="/">
<i class="icon-music"></i>
<input type="text" class="input-medium search-query" name="search" >
<button type="submit" class="btn">Search</button>
<a href="/add" role="button" class="btn" data-toggle="modal">Add music</a>
<table class="table table-bordered">
<thead>
<tr>
<th>Music</th>
<th>Year</th>
<th>Singer</th>
<th>Edit</th>
<th>Delete</th>
</tr>
</thead>
<tbody>
<tr th:each="music : ${musics}">
<td th:text="${music.name}"></td>
<td th:text="${music.year}"></td>
<td th:text="${music.singer}"></td>
<td><a th:href="@{/edit/{id}(id=${music.id})}" ><i class="icon-edit"></i></a></td>
<td><a th:href="@{/delete/{id}(id=${music.id})}" ><i class="icon-trash"></i></a></td>
</tr>
</tbody>
</table>
</form>
</div>
<script src="https://code.jquery.com/jquery.js"></script>
<script src="/js/bootstrap.min.js"></script>
</body>
</html>
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head>
<title>Music Store</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="/css/bootstrap.min.css">
</head>
<body>
<div class="container">
<h1>Music Store</h1>
<form th:action="@{/add}" th:object="${music}" method="post">
<input id="id" name="id" type="hidden" th:field="*{id}">
<div class="form-group">
<label for="musicNameId">Name</label>
<input type="text" class="form-control" th:field="*{name}" id="musicNameId" placeholder="Enter Music">
</div>
<div class="form-group">
<label for="musicYearId">Year</label>
<input type="number" class="form-control" th:field="*{year}" id="musicYearId" placeholder="Enter Year"
min="1000" max="2020">
</div>
<div class="form-group">
<label for="musicSingerId">Singer</label>
<input type="text" class="form-control" th:field="*{singer}" id="musicSingerId" placeholder="Enter Singer">
</div>
<div class="form-group">
<label for="musicLyricsId">Lyrics</label>
<textarea class="form-control" id="musicLyricsId" rows="3" th:field="*{lyrics}"></textarea>
</div>
<button type="submit" class="btn">Save</button>
</form>
</div>
<script src="https://code.jquery.com/jquery.js"></script>
<script src="/js/bootstrap.min.js"></script>
</body>
</html>
I hope this post helped explain the benefits of a full-text search engine (instead of a classic SQL database), with two sample applications that compare the code for the most popular FTS engines: Elasticsearch and Apache Solr. Both are open source, have rich documentation, and are supported on Platform.sh!
Platform.sh enables you to very easily leverage Elasticsearch or Apache Solr for your business needs.
We also have a repository with Java code samples beyond the Java templates. And stay tuned: if you're a Jakarta EE enthusiast, keep an eye out for our upcoming post!