Build a Text and Image Search Node.js AI App
Vector search is a powerful way to bring the magic of generative AI to your applications. And if you have the right tools, it’s not necessarily difficult. Here, I’ll show you a simple way to build a Node.js application with DataStax Astra DB (and vector search) support by using stargate-mongoose and a JSON API.
I’m a new engineer on the DataStax Stargate team. It is something of a tradition that a new team member’s first ticket is to build a demo app with Stargate’s API. I decided to build a Node.js app using Stargate’s new JSON API together with its Mongoose driver, stargate-mongoose.
Computers and cameras are my two obsessions. Whether it’s lines of code or lines that frame a photograph, I enjoy the logic and artistic creativity that make my life colorful. So, to combine my two passions, I decided to build the Photography-Site as a way to organize my work.
To make it an AI app, I also decided to incorporate vector similarity search using the vector search capabilities in Astra DB.
Stargate-Mongoose and JSON API
I am not particularly proficient with JavaScript, but building this app was still relatively effortless.
Mongoose is a widely-used object data mapping tool, often paired with the MongoDB driver, and boasts an active JavaScript developer community. The open-source API framework Stargate offers a new Mongoose driver called stargate-mongoose. It’s an alternative driver for Mongoose, and it is based on Stargate’s new JSON API, which is a stand-alone microservice for Stargate that gives access to data stored in a Cassandra cluster using a JSON Document-based interface.
This combination gives Mongoose developers an open-source way to build on Apache Cassandra. With stargate-mongoose working alongside Mongoose and the new JSON API, JavaScript developers get a JSON-oriented data model along with Cassandra’s scalability and performance.
Vector Search
Stargate JSON API and Stargate-Mongoose provide full support for Astra Vector Search, which lets AI applications find the items in a collection that are most closely related to a given query. This depends on being able to store embedding vectors: arrays of floating-point numbers that represent objects or entities so that similar items end up close together. Astra DB vector search integrates this feature into the serverless Astra DB database.
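To make the idea concrete, here is a minimal sketch of what a similarity search computes. It assumes cosine similarity as the metric (this post doesn't say which metric the collection uses) and a brute-force scan over a handful of toy vectors. A vector database does the same job at scale with an index instead of a linear scan. The function names and sample data are made up for illustration.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Returns a value in [-1, 1]; higher means "more similar".
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-k: score every item against the query, sort, keep k.
function topK(queryVector, items, k) {
  return items
    .map((item) => ({ ...item, score: cosineSimilarity(queryVector, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy three-dimensional "embeddings" (real ones have hundreds of dimensions).
const samplePhotos = [
  { name: 'meadow', vector: [0.9, 0.1, 0.0] },
  { name: 'hillside', vector: [0.7, 0.3, 0.1] },
  { name: 'harbor', vector: [0.0, 0.2, 0.9] },
];

const nearest = topK([1, 0, 0], samplePhotos, 2);
console.log(nearest.map((p) => p.name)); // meadow first, then hillside
```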
Architecture
The demo app is a Node.js application developed with the Express web application framework. It stores and fetches all data (including vectors) from Astra DB by using Stargate-Mongoose as a Mongoose driver. Stargate-Mongoose relies on the Stargate JSON API to access Astra DB.
As for the vector search part, the app uses the OpenAI embedding API to generate text embedding vectors and Google MediaPipe to generate image embedding vectors. Details for these will be discussed later.
Photography-Site App Walkthrough
Here, I’ll walk through the various operations that are supported by the application and show you some of the key API calls that make this possible.
Basic Functionality
The app supports basic functionality such as image browsing by categories, exploring the latest images, showing random images, adding images, and searching an image by name.
The app presents images by category on the homepage. To store and fetch data in Astra DB using Stargate-Mongoose, we first need to construct the data model. Then, one simple find method will fetch data for you.
const photoSchema = new mongoose.Schema({
//schema fields
});
const Photo = mongoose.model('photo', photoSchema);
const photosOfCategory = await Photo.find({ 'category': categoryName }).limit(limitNumber);
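The schema fields are elided in the snippet above. Judging from the fields set when a photo is saved later in this post, the definition might look roughly like the following sketch. Everything here is an assumption rather than the app's actual schema: the required flags, the validator, and the 1536 dimension (the size of OpenAI's text-embedding-ada-002 vectors, which this post doesn't confirm as the model used). The definition is written as a plain object so it can be read and run on its own; in the app it would be passed to new mongoose.Schema(...).

```javascript
// Assumed embedding size: 1536 matches OpenAI's text-embedding-ada-002.
// Change this to match whichever embedding model you actually use.
const EMBEDDING_DIMENSION = 1536;

// Hypothetical field layout, inferred from the fields used when saving a photo.
const photoSchemaDefinition = {
  name: { type: String, required: true },
  description: { type: String, required: true },
  category: { type: String, required: true },
  image: { type: String, required: true }, // stored image file name
  // The embedding lives in the special $vector field.
  $vector: {
    type: [Number],
    // Allow a missing vector, otherwise insist on the expected length.
    validate: (v) => v == null || v.length === EMBEDDING_DIMENSION,
  },
};

// In the app, this object would be used as:
//   const photoSchema = new mongoose.Schema(photoSchemaDefinition);
```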
Once the app obtains the list of photos using the photo model, it can populate the home screen.
Clicking on one specific photo will pull out its detailed information, including photo name, photo category, and photo description.
Behind the scenes, this uses the Mongoose findById method to get the target photo from Astra DB.
const photo = await Photo.findById(photoId);
The app enables adding photos, including photo name, photo description, category, and photo image itself as input.
When the user clicks “Add Photo,” the app creates a new Photo object and calls the save method; data will be saved into Astra DB.
const newPhoto = new Photo({
name: req.body.name,
description: req.body.description,
category: req.body.category,
image: newImageName,
"$vector": description_embedding,
});
await newPhoto.save();
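The snippet above uses description_embedding without showing where it comes from; it is the embedding of the photo description. One way to wire that step in is sketched below. The helper name buildPhotoDocument is hypothetical, and the embedding function is passed in as a parameter so the sketch runs without a database or an API key.

```javascript
// Builds the plain object for a new photo, embedding its description first.
// `embedText` is any async function that maps a string to an array of numbers
// (in the app, that would be the text embedding helper).
async function buildPhotoDocument(body, imageName, embedText) {
  const descriptionEmbedding = await embedText(body.description);
  return {
    name: body.name,
    description: body.description,
    category: body.category,
    image: imageName,
    $vector: descriptionEmbedding,
  };
}

// In a route handler, the result would feed the model shown above:
//   const doc = await buildPhotoDocument(req.body, newImageName, getTextEmbedding);
//   await new Photo(doc).save();
```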
Text Similarity Search
The app enables searching photos by text similarity. You describe the photo or scene you want and use that description as the search input. Behind the scenes, the feature uses text embeddings and DataStax vector search.
Remember that every time we add a photo, it requires a photo description as a data model field. You take this description text and call the OpenAI text embedding API to get the corresponding embedding vector. Similarly, when doing the text similarity search, you also get an embedding vector for the search text. Then, you can run a similarity search by chaining the find and sort methods.
const description_embedding = await getTextEmbedding(searchTerm);
const photos = await Photo.find({}).sort({ $vector: { $meta: description_embedding } }).limit(3);
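The getTextEmbedding helper isn't shown in this post. A minimal sketch of what it could look like follows, calling the OpenAI embeddings REST endpoint directly. The model name text-embedding-ada-002, the OPENAI_API_KEY environment variable, and the options parameter (including the injectable fetchImpl, which exists only to make the function easy to test) are my assumptions, not details from the app. It relies on the global fetch available in Node.js 18 and later.

```javascript
// Requests an embedding vector for `text` from the OpenAI embeddings endpoint.
// Resolves to an array of floating-point numbers.
async function getTextEmbedding(text, options = {}) {
  const {
    apiKey = process.env.OPENAI_API_KEY, // assumed env variable name
    model = 'text-embedding-ada-002', // assumed model
    fetchImpl = globalThis.fetch, // injectable for testing
  } = options;

  const response = await fetchImpl('https://api.openai.com/v1/embeddings', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, input: text }),
  });

  if (!response.ok) {
    throw new Error(`Embedding request failed with status ${response.status}`);
  }

  // The response carries one entry in `data` per input string.
  const payload = await response.json();
  return payload.data[0].embedding;
}
```

The same function serves both paths: embedding a description when a photo is added, and embedding the search text at query time.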
In the following screenshot, as you can see, we searched for “a place for cows to eat.” We got two photo results. Both contain grass and show herbivores grazing or standing in it, so the text similarity search result makes sense.
Image Similarity Search
Besides text similarity search, another interesting feature I built is image similarity search. An image embedding vector needs to be generated first. This time, the app uses Google MediaPipe to generate the image embedding, with the mobilenet_v3_large.tflite model. The embedding is produced by a Python script, so the app relies on python-shell to run that script from the Node.js environment. Once we have the image embedding vectors, we can do a vector search.
const photo_embedding = await getPhotoEmbedding(image);
const photos = await PhotoEmbedding.find({}).sort({ $vector: { $meta: photo_embedding } }).limit(3);
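The getPhotoEmbedding helper isn't shown either. The app uses python-shell for this step; the sketch below swaps in Node's built-in child_process module instead, purely to stay dependency-free, so it is not the app's actual implementation. It assumes a hypothetical script named embed_image.py that takes an image path as its last argument and prints the embedding to stdout as a JSON array. The command and args options are also assumptions, added so a different interpreter or script can be substituted.

```javascript
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');

const execFileAsync = promisify(execFile);

// Runs an external script that prints an embedding as a JSON array,
// then parses and validates the result.
async function getPhotoEmbedding(imagePath, options = {}) {
  const {
    command = 'python3', // assumed interpreter name
    args = ['embed_image.py'], // hypothetical script name
  } = options;

  const { stdout } = await execFileAsync(command, [...args, imagePath], {
    maxBuffer: 10 * 1024 * 1024, // leave room for long vectors
  });

  const embedding = JSON.parse(stdout);
  if (!Array.isArray(embedding) || !embedding.every((x) => typeof x === 'number')) {
    throw new Error('Expected the script to print a JSON array of numbers');
  }
  return embedding;
}
```

Passing the image path as a separate argument to execFile, rather than building a shell command string, avoids quoting problems with unusual file names.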
Here, we have an image of a “car in the sunset.” We can search it to get similar image results.
As you can see, we have three results. They all share a sunset color tone, and they follow a similar composition: light sky above and dark land below.