pgvector-java

pgvector support for Java, Kotlin, Groovy, and Scala

Supports JDBC, Spring JDBC, Groovy SQL, and Slick

Getting Started

For Maven, add to pom.xml under <dependencies>:

<dependency>
  <groupId>com.pgvector</groupId>
  <artifactId>pgvector</artifactId>
  <version>0.1.6</version>
</dependency>

For sbt, add to build.sbt:

libraryDependencies += "com.pgvector" % "pgvector" % "0.1.6"

For other build tools, see this page.

And follow the instructions for your database library:

Java - JDBC, Spring JDBC, Hibernate, R2DBC
Kotlin - JDBC
Groovy - JDBC, Groovy SQL
Scala - JDBC, Slick

Or check out some examples:

Embeddings with OpenAI
Binary embeddings with Cohere
Sentence embeddings with Deep Java Library
Hybrid search with Deep Java Library (Reciprocal Rank Fusion)
Sparse search with Text Embeddings Inference
Extended-connectivity fingerprints with the Chemistry Development Kit
Recommendations with Disco
Horizontal scaling with Citus
Bulk loading with COPY

JDBC (Java)

Import the PGvector class

import com.pgvector.PGvector;

Enable the extension

Statement setupStmt = conn.createStatement();
setupStmt.executeUpdate("CREATE EXTENSION IF NOT EXISTS vector");

Register the types with your connection

PGvector.registerTypes(conn);

Create a table

Statement createStmt = conn.createStatement();
createStmt.executeUpdate("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))");

Insert a vector

PreparedStatement insertStmt = conn.prepareStatement("INSERT INTO items (embedding) VALUES (?)");
insertStmt.setObject(1, new PGvector(new float[] {1, 1, 1}));
insertStmt.executeUpdate();

Get the nearest neighbors

PreparedStatement neighborStmt = conn.prepareStatement("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5");
neighborStmt.setObject(1, new PGvector(new float[] {1, 1, 1}));
ResultSet rs = neighborStmt.executeQuery();
while (rs.next()) {
    System.out.println((PGvector) rs.getObject("embedding"));
}

Add an approximate index

Statement indexStmt = conn.createStatement();
indexStmt.executeUpdate("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)");
// or
indexStmt.executeUpdate("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)");

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
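
As a rough illustration of what the three operator classes measure, the distances can be written in plain Java. This is a sketch for intuition only: `DistanceSketch` and its methods are hypothetical names, not part of pgvector-java, and the database computes these values itself. One detail from the pgvector documentation worth noting is that the `<#>` operator returns the negative inner product, so that ascending order still puts the best matches first.

```java
// Plain-Java reference versions of the three distance measures.
// Hypothetical helper for illustration; not part of pgvector-java.
final class DistanceSketch {
    private DistanceSketch() {}

    // Euclidean (L2) distance, the measure behind vector_l2_ops and <->.
    static double l2Distance(float[] a, float[] b) {
        checkSameLength(a, b);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = (double) a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Inner product, the measure behind vector_ip_ops.
    // pgvector's <#> operator returns the negative of this value.
    static double innerProduct(float[] a, float[] b) {
        checkSameLength(a, b);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    // Cosine distance (1 - cosine similarity), the measure behind vector_cosine_ops and <=>.
    static double cosineDistance(float[] a, float[] b) {
        double normA = Math.sqrt(innerProduct(a, a));
        double normB = Math.sqrt(innerProduct(b, b));
        return 1.0 - innerProduct(a, b) / (normA * normB);
    }

    private static void checkSameLength(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same number of dimensions");
        }
    }
}
```

If the embeddings are normalized to unit length, inner product and cosine distance rank neighbors identically, so either operator class can be used.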

See a full example

Spring JDBC

Import the PGvector class

import com.pgvector.PGvector;

Enable the extension

jdbcTemplate.execute("CREATE EXTENSION IF NOT EXISTS vector");

Create a table

jdbcTemplate.execute("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))");

Insert a vector

Object[] insertParams = new Object[] { new PGvector(new float[] {1, 1, 1}) };
jdbcTemplate.update("INSERT INTO items (embedding) VALUES (?)", insertParams);

Get the nearest neighbors

Object[] neighborParams = new Object[] { new PGvector(new float[] {1, 1, 1}) };
List<Map<String, Object>> rows = jdbcTemplate.queryForList("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5", neighborParams);
for (Map<String, Object> row : rows) {
    System.out.println(row.get("embedding"));
}

Add an approximate index

jdbcTemplate.execute("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)");
// or
jdbcTemplate.execute("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)");

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
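
Because the index operator class and the query operator have to agree for the index to be used, it can help to keep the pairing in one place. The enum below is a minimal sketch of that idea; `VectorMetric` and its methods are hypothetical names, not part of pgvector-java or Spring. The operator and operator class pairs themselves (`<->` with vector_l2_ops, `<#>` with vector_ip_ops, `<=>` with vector_cosine_ops) come from the pgvector documentation.

```java
// Pairs each distance measure with its query operator and index operator class.
// Hypothetical helper for illustration; not part of pgvector-java.
enum VectorMetric {
    L2("<->", "vector_l2_ops"),
    INNER_PRODUCT("<#>", "vector_ip_ops"),
    COSINE("<=>", "vector_cosine_ops");

    final String operator;
    final String opClass;

    VectorMetric(String operator, String opClass) {
        this.operator = operator;
        this.opClass = opClass;
    }

    // Builds an HNSW index statement for this metric.
    // table and column are concatenated as-is, so pass only trusted identifiers.
    String hnswIndexSql(String table, String column) {
        return "CREATE INDEX ON " + table + " USING hnsw (" + column + " " + opClass + ")";
    }

    // Builds a nearest-neighbor query with one bind parameter for the query vector.
    String nearestNeighborSql(String table, String column, int limit) {
        return "SELECT * FROM " + table + " ORDER BY " + column + " " + operator + " ? LIMIT " + limit;
    }
}
```

With this in place, a call such as `jdbcTemplate.execute(VectorMetric.COSINE.hnswIndexSql("items", "embedding"))` would create the index, and `nearestNeighborSql` would produce the matching query string.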

See a full example

Hibernate

Hibernate 6.4+ has a vector module (use this instead of com.pgvector.pgvector).

For Maven, add to pom.xml under <dependencies>:

<dependency>
  <groupId>org.hibernate.orm</groupId>
  <artifactId>hibernate-vector</artifactId>
  <version>6.4.0.Final</version>
</dependency>

Define an entity

import jakarta.persistence.*;
import org.hibernate.annotations.Array;
import org.hibernate.annotations.JdbcTypeCode;
import org.hibernate.type.SqlTypes;

@Entity
class Item {
    @Id
    @GeneratedValue
    private Long id;

    @Column
    @JdbcTypeCode(SqlTypes.VECTOR)
    @Array(length = 3) // dimensions
    private float[] embedding;

    public void setEmbedding(float[] embedding) {
        this.embedding = embedding;
    }
}

Insert a vector

Item item = new Item();
item.setEmbedding(new float[] {1, 1, 1});
entityManager.persist(item);

Get the nearest neighbors

List<Item> items = entityManager
    .createQuery("FROM Item ORDER BY l2_distance(embedding, :embedding) LIMIT 5", Item.class)
    .setParameter("embedding", new float[] {1, 1, 1})
    .getResultList();

See a full example

R2DBC

R2DBC PostgreSQL 1.0.3+ supports the vector type (use this instead of com.pgvector.pgvector).

For Maven, add to pom.xml under <dependencies>:

<dependency>
  <groupId>org.postgresql</groupId>
  <artifactId>r2dbc-postgresql</artifactId>
  <version>1.0.3.RELEASE</version>
</dependency>

JDBC (Kotlin)

Import the PGvector class

import com.pgvector.PGvector

Enable the extension

val setupStmt = conn.createStatement()
setupStmt.executeUpdate("CREATE EXTENSION IF NOT EXISTS vector")

Register the types with your connection

PGvector.registerTypes(conn)

Create a table

val createStmt = conn.createStatement()
createStmt.executeUpdate("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")

Insert a vector

val insertStmt = conn.prepareStatement("INSERT INTO items (embedding) VALUES (?)")
insertStmt.setObject(1, PGvector(floatArrayOf(1.0f, 1.0f, 1.0f)))
insertStmt.executeUpdate()

Get the nearest neighbors

val neighborStmt = conn.prepareStatement("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5")
neighborStmt.setObject(1, PGvector(floatArrayOf(1.0f, 1.0f, 1.0f)))
val rs = neighborStmt.executeQuery()
while (rs.next()) {
    println(rs.getObject("embedding") as PGvector?)
}

Add an approximate index

val indexStmt = conn.createStatement()
indexStmt.executeUpdate("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
// or
indexStmt.executeUpdate("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

JDBC (Groovy)

Import the PGvector class

import com.pgvector.PGvector

Enable the extension

def setupStmt = conn.createStatement()
setupStmt.executeUpdate("CREATE EXTENSION IF NOT EXISTS vector")

Register the types with your connection

PGvector.registerTypes(conn)

Create a table

def createStmt = conn.createStatement()
createStmt.executeUpdate("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")

Insert a vector

def insertStmt = conn.prepareStatement("INSERT INTO items (embedding) VALUES (?)")
insertStmt.setObject(1, new PGvector([1, 1, 1] as float[]))
insertStmt.executeUpdate()

Get the nearest neighbors

def neighborStmt = conn.prepareStatement("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5")
neighborStmt.setObject(1, new PGvector([1, 1, 1] as float[]))
def rs = neighborStmt.executeQuery()
while (rs.next()) {
    println((PGvector) rs.getObject("embedding"))
}

Add an approximate index

def indexStmt = conn.createStatement()
indexStmt.executeUpdate("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
// or
indexStmt.executeUpdate("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

Groovy SQL

Import the PGvector class

import com.pgvector.PGvector

Enable the extension

sql.execute "CREATE EXTENSION IF NOT EXISTS vector"

Create a table

sql.execute "CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))"

Insert a vector

def params = [new PGvector([1, 1, 1] as float[])]
sql.executeInsert "INSERT INTO items (embedding) VALUES (?)", params

Get the nearest neighbors

def params = [new PGvector([1, 1, 1] as float[])]
sql.eachRow("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5", params) { row ->
    println row.embedding
}

Add an approximate index

sql.execute "CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)"
// or
sql.execute "CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)"

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

JDBC (Scala)

Import the PGvector class

import com.pgvector.PGvector

Enable the extension

val setupStmt = conn.createStatement()
setupStmt.executeUpdate("CREATE EXTENSION IF NOT EXISTS vector")

Register the types with your connection

PGvector.registerTypes(conn)

Create a table

val createStmt = conn.createStatement()
createStmt.executeUpdate("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")

Insert a vector

val insertStmt = conn.prepareStatement("INSERT INTO items (embedding) VALUES (?)")
insertStmt.setObject(1, new PGvector(Array[Float](1, 1, 1)))
insertStmt.executeUpdate()

Get the nearest neighbors

val neighborStmt = conn.prepareStatement("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5")
neighborStmt.setObject(1, new PGvector(Array[Float](1, 1, 1)))
val rs = neighborStmt.executeQuery()
while (rs.next()) {
  println(rs.getObject("embedding").asInstanceOf[PGvector])
}

Add an approximate index

val indexStmt = conn.createStatement()
indexStmt.executeUpdate("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
// or
indexStmt.executeUpdate("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

Slick

Import the PGvector class

import com.pgvector.PGvector

Enable the extension

db.run(sqlu"CREATE EXTENSION IF NOT EXISTS vector")

Add a vector column

class Items(tag: Tag) extends Table[(String)](tag, "items") {
  def embedding = column[String]("embedding", O.SqlType("vector(3)"))
  def * = (embedding)
}

Insert a vector

val embedding = new PGvector(Array[Float](1, 1, 1)).toString
db.run(sqlu"INSERT INTO items (embedding) VALUES ($embedding::vector)")

Get the nearest neighbors

val embedding = new PGvector(Array[Float](1, 1, 1)).toString
db.run(sql"SELECT * FROM items ORDER BY embedding <-> $embedding::vector LIMIT 5".as[(String)])

Add an approximate index

db.run(sqlu"CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
// or
db.run(sqlu"CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

Reference

Vectors

Create a vector from an array

PGvector vec = new PGvector(new float[] {1, 2, 3});

Or a List<T>

List<Float> list = List.of(Float.valueOf(1), Float.valueOf(2), Float.valueOf(3));
PGvector vec = new PGvector(list);

Get an array

float[] arr = vec.toArray();
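
When a library only passes strings, as the Slick section does with `toString` and a `::vector` cast, the vector travels as a text literal. The sketch below assumes the bracketed, comma-separated form that pgvector documents for vector literals (for example `[1,2,3]`). `VectorLiteralSketch` is a hypothetical helper, not part of pgvector-java, and the exact string that `PGvector.toString` returns is not confirmed here.

```java
// Converts between float arrays and a bracketed text literal such as [1.0,2.0,3.0].
// Hypothetical helper for illustration; not part of pgvector-java.
final class VectorLiteralSketch {
    private VectorLiteralSketch() {}

    // Formats the values as a comma-separated list inside square brackets.
    static String toLiteral(float[] values) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]);
        }
        return sb.append(']').toString();
    }

    // Parses a literal produced by toLiteral (or written by hand) back into an array.
    static float[] parseLiteral(String literal) {
        String s = literal.trim();
        if (s.length() < 2 || s.charAt(0) != '[' || s.charAt(s.length() - 1) != ']') {
            throw new IllegalArgumentException("expected a literal like [1,2,3]");
        }
        String body = s.substring(1, s.length() - 1).trim();
        if (body.isEmpty()) {
            return new float[0];
        }
        String[] parts = body.split(",");
        float[] out = new float[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Float.parseFloat(parts[i].trim());
        }
        return out;
    }
}
```

Where the driver supports it, binding a `PGvector` as a parameter avoids the text round trip entirely.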

Half Vectors

Create a half vector from an array

PGhalfvec vec = new PGhalfvec(new float[] {1, 2, 3});

Or a List<T>

List<Float> list = List.of(Float.valueOf(1), Float.valueOf(2), Float.valueOf(3));
PGhalfvec vec = new PGhalfvec(list);

Get an array

float[] arr = vec.toArray();

Binary Vectors

Create a binary vector from a byte array

PGbit vec = new PGbit(new byte[] {(byte) 0b00000000, (byte) 0b11111111});

Or a boolean array

PGbit vec = new PGbit(new boolean[] {true, false, true});

Or a string

PGbit vec = new PGbit("101");

Get the length (number of bits)

int length = vec.length();

Get a byte array

byte[] bytes = vec.toByteArray();

Or a boolean array

boolean[] bits = vec.toArray();
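
The byte array, boolean array, and string forms describe the same bits. The sketch below shows one way they can correspond, assuming the first bit is stored in the most significant position of the first byte. That ordering is an assumption made for illustration and is not confirmed by this README. `BitPackingSketch` is a hypothetical helper, not part of pgvector-java.

```java
// Converts between boolean arrays, packed bytes, and bit strings.
// Assumes most-significant-bit-first packing; hypothetical helper, not part of pgvector-java.
final class BitPackingSketch {
    private BitPackingSketch() {}

    // Packs bits into bytes, padding the last byte with zero bits.
    static byte[] pack(boolean[] bits) {
        byte[] out = new byte[(bits.length + 7) / 8];
        for (int i = 0; i < bits.length; i++) {
            if (bits[i]) {
                out[i / 8] |= (byte) (1 << (7 - (i % 8)));
            }
        }
        return out;
    }

    // Reads the first `length` bits back out of the packed bytes.
    static boolean[] unpack(byte[] bytes, int length) {
        if (length < 0 || length > bytes.length * 8) {
            throw new IllegalArgumentException("length out of range");
        }
        boolean[] bits = new boolean[length];
        for (int i = 0; i < length; i++) {
            bits[i] = ((bytes[i / 8] >> (7 - (i % 8))) & 1) == 1;
        }
        return bits;
    }

    // Renders the bits as a string of '0' and '1' characters, such as "101".
    static String toBitString(boolean[] bits) {
        StringBuilder sb = new StringBuilder(bits.length);
        for (boolean bit : bits) {
            sb.append(bit ? '1' : '0');
        }
        return sb.toString();
    }
}
```

Note that a byte array alone cannot express a length that is not a multiple of 8, which is why the sketch takes the bit count as a separate argument when unpacking.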

Sparse Vectors

Create a sparse vector from an array

PGsparsevec vec = new PGsparsevec(new float[] {1, 0, 2, 0, 3, 0});

Or a map of non-zero elements

Map<Integer, Float> map = new HashMap<Integer, Float>();
map.put(Integer.valueOf(0), Float.valueOf(1));
map.put(Integer.valueOf(2), Float.valueOf(2));
map.put(Integer.valueOf(4), Float.valueOf(3));
PGsparsevec vec = new PGsparsevec(map, 6);

Note: Indices start at 0

Get the number of dimensions

int dim = vec.getDimensions();

Get the indices of non-zero elements

int[] indices = vec.getIndices();

Get the values of non-zero elements

float[] values = vec.getValues();

Get an array

float[] arr = vec.toArray();
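
To make the relationship between the dense array, the map of non-zero elements, and the dimension count concrete, here is a plain-Java sketch. `SparseVectorSketch` is a hypothetical helper, not part of pgvector-java. The `toSqlText` method assumes the `{index:value,...}/dimensions` text form from the pgvector documentation, where indices start at 1, so the 0-based indices used on the Java side are shifted by one.

```java
import java.util.Map;
import java.util.TreeMap;

// Converts between dense arrays and a map of non-zero elements (0-based indices).
// Hypothetical helper for illustration; not part of pgvector-java.
final class SparseVectorSketch {
    private SparseVectorSketch() {}

    // Collects the non-zero elements of a dense array, keyed by 0-based index.
    static Map<Integer, Float> toSparse(float[] dense) {
        Map<Integer, Float> elements = new TreeMap<>();
        for (int i = 0; i < dense.length; i++) {
            if (dense[i] != 0f) {
                elements.put(i, dense[i]);
            }
        }
        return elements;
    }

    // Expands the non-zero elements back into a dense array of the given size.
    static float[] toDense(Map<Integer, Float> elements, int dimensions) {
        float[] dense = new float[dimensions];
        for (Map.Entry<Integer, Float> e : elements.entrySet()) {
            int index = e.getKey();
            if (index < 0 || index >= dimensions) {
                throw new IllegalArgumentException("index out of range: " + index);
            }
            dense[index] = e.getValue();
        }
        return dense;
    }

    // Renders the assumed SQL text form, shifting to 1-based indices.
    static String toSqlText(Map<Integer, Float> elements, int dimensions) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<Integer, Float> e : new TreeMap<>(elements).entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(e.getKey() + 1).append(':').append(e.getValue());
            first = false;
        }
        return sb.append("}/").append(dimensions).toString();
    }
}
```

The dimension count has to be carried separately because trailing zeros leave no trace in the map, which is why the map constructor above takes `6` as a second argument.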

History

View the changelog

Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help:

Report bugs
Fix bugs and submit pull requests
Write, clarify, or fix documentation
Suggest or add new features

To get started with development:

git clone https://github.com/pgvector/pgvector-java.git
cd pgvector-java
createdb pgvector_java_test
mvn test

To run an example:

cd examples/loading
createdb pgvector_example
mvn package
java -jar target/example-jar-with-dependencies.jar
