pgvector support for Java, Kotlin, Groovy, and Scala
Supports JDBC, Spring JDBC, Groovy SQL, and Slick
For Maven, add to pom.xml under <dependencies>:

    <dependency>
      <groupId>com.pgvector</groupId>
      <artifactId>pgvector</artifactId>
      <version>0.1.6</version>
    </dependency>

For sbt, add to build.sbt:

    libraryDependencies += "com.pgvector" % "pgvector" % "0.1.6"

For other build tools, see this page.
And follow the instructions for your database library:
- Java - JDBC, Spring JDBC, Hibernate, R2DBC
- Kotlin - JDBC
- Groovy - JDBC, Groovy SQL
- Scala - JDBC, Slick
Or check out some examples:
- Embeddings with OpenAI
- Binary embeddings with Cohere
- Sentence embeddings with Deep Java Library
- Hybrid search with Deep Java Library (Reciprocal Rank Fusion)
- Sparse search with Text Embeddings Inference
- Extended-connectivity fingerprints with the Chemistry Development Kit
- Recommendations with Disco
- Horizontal scaling with Citus
- Bulk loading with COPY

JDBC (Java)

Import the PGvector class

    import com.pgvector.PGvector;

Enable the extension

    Statement setupStmt = conn.createStatement();
    setupStmt.executeUpdate("CREATE EXTENSION IF NOT EXISTS vector");

Register the types with your connection

    PGvector.registerTypes(conn);

Create a table

    Statement createStmt = conn.createStatement();
    createStmt.executeUpdate("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))");

Insert a vector

    PreparedStatement insertStmt = conn.prepareStatement("INSERT INTO items (embedding) VALUES (?)");
    insertStmt.setObject(1, new PGvector(new float[] {1, 1, 1}));
    insertStmt.executeUpdate();

Get the nearest neighbors

    PreparedStatement neighborStmt = conn.prepareStatement("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5");
    neighborStmt.setObject(1, new PGvector(new float[] {1, 1, 1}));
    ResultSet rs = neighborStmt.executeQuery();
    while (rs.next()) {
        System.out.println((PGvector) rs.getObject("embedding"));
    }

Add an approximate index

    Statement indexStmt = conn.createStatement();
    indexStmt.executeUpdate("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)");
    // or
    indexStmt.executeUpdate("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)");

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example
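
To make the nearest-neighbor query above more concrete, here is a minimal in-memory sketch of what ORDER BY embedding <-> ? LIMIT 5 computes when no approximate index is involved: every row is ranked by its L2 (Euclidean) distance to the query vector and the closest ones are returned. The class and method names (ExactNearestNeighbors, l2Distance, nearest) are hypothetical and not part of pgvector-java; the database does this work for you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration only; not part of pgvector-java.
final class ExactNearestNeighbors {
    // Euclidean (L2) distance between two vectors of equal length.
    static double l2Distance(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = (double) a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Returns up to `limit` vectors from `items`, closest to `query` first.
    static List<float[]> nearest(List<float[]> items, float[] query, int limit) {
        List<float[]> sorted = new ArrayList<>(items);
        sorted.sort(Comparator.comparingDouble((float[] v) -> l2Distance(v, query)));
        return sorted.subList(0, Math.min(limit, sorted.size()));
    }

    public static void main(String[] args) {
        List<float[]> items = List.of(
            new float[] {1, 1, 1},
            new float[] {2, 2, 2},
            new float[] {1, 1, 2});
        // Print the two vectors closest to (1, 1, 1).
        for (float[] v : nearest(items, new float[] {1, 1, 1}, 2)) {
            System.out.println(Arrays.toString(v));
        }
    }
}
```

An exact scan like this touches every row. The hnsw and ivfflat indexes trade some recall for speed by avoiding the full scan.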

Spring JDBC

Import the PGvector class

    import com.pgvector.PGvector;

Enable the extension

    jdbcTemplate.execute("CREATE EXTENSION IF NOT EXISTS vector");

Create a table

    jdbcTemplate.execute("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))");

Insert a vector

    Object[] insertParams = new Object[] { new PGvector(new float[] {1, 1, 1}) };
    jdbcTemplate.update("INSERT INTO items (embedding) VALUES (?)", insertParams);

Get the nearest neighbors

    Object[] neighborParams = new Object[] { new PGvector(new float[] {1, 1, 1}) };
    List<Map<String, Object>> rows = jdbcTemplate.queryForList("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5", neighborParams);
    for (Map<String, Object> row : rows) {
        System.out.println(row.get("embedding"));
    }

Add an approximate index

    jdbcTemplate.execute("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)");
    // or
    jdbcTemplate.execute("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)");

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example
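
The operator classes named above correspond to different distance measures. As a rough illustration of the two alternatives to L2, the sketch below computes an inner product and a cosine distance (taken here as 1 minus cosine similarity) on plain float arrays. The class and method names are hypothetical and not part of pgvector-java; this only shows the arithmetic, not how the extension implements it.

```java
// Hypothetical helper for illustration only; not part of pgvector-java.
final class VectorMetrics {
    // Sum of element-wise products of two vectors of equal length.
    static double innerProduct(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    // Cosine distance: 1 - (a . b) / (|a| * |b|). Undefined for zero vectors.
    static double cosineDistance(float[] a, float[] b) {
        double normA = Math.sqrt(innerProduct(a, a));
        double normB = Math.sqrt(innerProduct(b, b));
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("cosine distance is undefined for a zero vector");
        }
        return 1.0 - innerProduct(a, b) / (normA * normB);
    }

    public static void main(String[] args) {
        float[] a = {1, 2, 3};
        float[] b = {2, 4, 6}; // same direction as a, twice the length
        System.out.println(innerProduct(a, b));   // 28.0
        System.out.println(cosineDistance(a, b)); // approximately 0, since the directions match
    }
}
```

Cosine distance ignores vector length, so it suits embeddings where only direction matters. Inner product rewards both alignment and magnitude.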

Hibernate

Hibernate 6.4+ has a vector module (use this instead of com.pgvector.pgvector).

For Maven, add to pom.xml under <dependencies>:

    <dependency>
      <groupId>org.hibernate.orm</groupId>
      <artifactId>hibernate-vector</artifactId>
      <version>6.4.0.Final</version>
    </dependency>

Define an entity

    import jakarta.persistence.*;
    import org.hibernate.annotations.Array;
    import org.hibernate.annotations.JdbcTypeCode;
    import org.hibernate.type.SqlTypes;

    @Entity
    class Item {
        @Id
        @GeneratedValue
        private Long id;

        @Column
        @JdbcTypeCode(SqlTypes.VECTOR)
        @Array(length = 3) // dimensions
        private float[] embedding;

        public void setEmbedding(float[] embedding) {
            this.embedding = embedding;
        }
    }

Insert a vector

    Item item = new Item();
    item.setEmbedding(new float[] {1, 1, 1});
    entityManager.persist(item);

Get the nearest neighbors

    List<Item> items = entityManager
        .createQuery("FROM Item ORDER BY l2_distance(embedding, :embedding) LIMIT 5", Item.class)
        .setParameter("embedding", new float[] {1, 1, 1})
        .getResultList();

See a full example

R2DBC

R2DBC PostgreSQL 1.0.3+ supports the vector type (use this instead of com.pgvector.pgvector).

For Maven, add to pom.xml under <dependencies>:

    <dependency>
      <groupId>org.postgresql</groupId>
      <artifactId>r2dbc-postgresql</artifactId>
      <version>1.0.3.RELEASE</version>
    </dependency>

JDBC (Kotlin)

Import the PGvector class

    import com.pgvector.PGvector

Enable the extension

    val setupStmt = conn.createStatement()
    setupStmt.executeUpdate("CREATE EXTENSION IF NOT EXISTS vector")

Register the types with your connection

    PGvector.registerTypes(conn)

Create a table

    val createStmt = conn.createStatement()
    createStmt.executeUpdate("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")

Insert a vector

    val insertStmt = conn.prepareStatement("INSERT INTO items (embedding) VALUES (?)")
    insertStmt.setObject(1, PGvector(floatArrayOf(1.0f, 1.0f, 1.0f)))
    insertStmt.executeUpdate()

Get the nearest neighbors

    val neighborStmt = conn.prepareStatement("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5")
    neighborStmt.setObject(1, PGvector(floatArrayOf(1.0f, 1.0f, 1.0f)))
    val rs = neighborStmt.executeQuery()
    while (rs.next()) {
        println(rs.getObject("embedding") as PGvector?)
    }

Add an approximate index

    val indexStmt = conn.createStatement()
    indexStmt.executeUpdate("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
    // or
    indexStmt.executeUpdate("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

JDBC (Groovy)

Import the PGvector class

    import com.pgvector.PGvector

Enable the extension

    def setupStmt = conn.createStatement()
    setupStmt.executeUpdate("CREATE EXTENSION IF NOT EXISTS vector")

Register the types with your connection

    PGvector.registerTypes(conn)

Create a table

    def createStmt = conn.createStatement()
    createStmt.executeUpdate("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")

Insert a vector

    def insertStmt = conn.prepareStatement("INSERT INTO items (embedding) VALUES (?)")
    insertStmt.setObject(1, new PGvector([1, 1, 1] as float[]))
    insertStmt.executeUpdate()

Get the nearest neighbors

    def neighborStmt = conn.prepareStatement("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5")
    neighborStmt.setObject(1, new PGvector([1, 1, 1] as float[]))
    def rs = neighborStmt.executeQuery()
    while (rs.next()) {
        println((PGvector) rs.getObject("embedding"))
    }

Add an approximate index

    def indexStmt = conn.createStatement()
    indexStmt.executeUpdate("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
    // or
    indexStmt.executeUpdate("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

Groovy SQL

Import the PGvector class

    import com.pgvector.PGvector

Enable the extension

    sql.execute "CREATE EXTENSION IF NOT EXISTS vector"

Create a table

    sql.execute "CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))"

Insert a vector

    def params = [new PGvector([1, 1, 1] as float[])]
    sql.executeInsert "INSERT INTO items (embedding) VALUES (?)", params

Get the nearest neighbors

    def params = [new PGvector([1, 1, 1] as float[])]
    sql.eachRow("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5", params) { row ->
        println row.embedding
    }

Add an approximate index

    sql.execute "CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)"
    // or
    sql.execute "CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)"

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

JDBC (Scala)

Import the PGvector class

    import com.pgvector.PGvector

Enable the extension

    val setupStmt = conn.createStatement()
    setupStmt.executeUpdate("CREATE EXTENSION IF NOT EXISTS vector")

Register the types with your connection

    PGvector.registerTypes(conn)

Create a table

    val createStmt = conn.createStatement()
    createStmt.executeUpdate("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")

Insert a vector

    val insertStmt = conn.prepareStatement("INSERT INTO items (embedding) VALUES (?)")
    insertStmt.setObject(1, new PGvector(Array[Float](1, 1, 1)))
    insertStmt.executeUpdate()

Get the nearest neighbors

    val neighborStmt = conn.prepareStatement("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5")
    neighborStmt.setObject(1, new PGvector(Array[Float](1, 1, 1)))
    val rs = neighborStmt.executeQuery()
    while (rs.next()) {
      println(rs.getObject("embedding").asInstanceOf[PGvector])
    }

Add an approximate index

    val indexStmt = conn.createStatement()
    indexStmt.executeUpdate("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
    // or
    indexStmt.executeUpdate("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

Slick

Import the PGvector class

    import com.pgvector.PGvector

Enable the extension

    db.run(sqlu"CREATE EXTENSION IF NOT EXISTS vector")

Add a vector column

    class Items(tag: Tag) extends Table[(String)](tag, "items") {
      def embedding = column[String]("embedding", O.SqlType("vector(3)"))
      def * = (embedding)
    }

Insert a vector

    val embedding = new PGvector(Array[Float](1, 1, 1)).toString
    db.run(sqlu"INSERT INTO items (embedding) VALUES ($embedding::vector)")

Get the nearest neighbors

    val embedding = new PGvector(Array[Float](1, 1, 1)).toString
    db.run(sql"SELECT * FROM items ORDER BY embedding <-> $embedding::vector LIMIT 5".as[(String)])

Add an approximate index

    db.run(sqlu"CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
    // or
    db.run(sqlu"CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example
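
The Slick steps above pass the vector as a string and cast it with ::vector. pgvector accepts a bracketed, comma-separated text literal such as [1,2,3] for that cast. The following is a minimal sketch of building and reading such a literal by hand, which can be useful when a library has no native vector binding. VectorLiteral, format, and parse are hypothetical names; in practice PGvector's toString (as used above) covers the formatting side.

```java
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of pgvector-java.
final class VectorLiteral {
    // Formats values as a bracketed, comma-separated literal, e.g. [1.0,2.0,3.0].
    static String format(float[] values) {
        StringJoiner joiner = new StringJoiner(",", "[", "]");
        for (float v : values) {
            joiner.add(Float.toString(v));
        }
        return joiner.toString();
    }

    // Parses a literal such as "[1,2,3]" back into a float array.
    static float[] parse(String literal) {
        String trimmed = literal.trim();
        if (!trimmed.startsWith("[") || !trimmed.endsWith("]")) {
            throw new IllegalArgumentException("expected a literal like [1,2,3]");
        }
        String body = trimmed.substring(1, trimmed.length() - 1).trim();
        if (body.isEmpty()) {
            return new float[0];
        }
        String[] parts = body.split(",");
        float[] values = new float[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Float.parseFloat(parts[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        String literal = format(new float[] {1, 2, 3});
        System.out.println(literal);          // [1.0,2.0,3.0]
        System.out.println(parse(literal)[1]); // 2.0
    }
}
```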

Vectors

Create a vector from an array

    PGvector vec = new PGvector(new float[] {1, 2, 3});

Or a List<T>

    List<Float> list = List.of(Float.valueOf(1), Float.valueOf(2), Float.valueOf(3));
    PGvector vec = new PGvector(list);

Get an array

    float[] arr = vec.toArray();

Half Vectors

Create a half vector from an array

    PGhalfvec vec = new PGhalfvec(new float[] {1, 2, 3});

Or a List<T>

    List<Float> list = List.of(Float.valueOf(1), Float.valueOf(2), Float.valueOf(3));
    PGhalfvec vec = new PGhalfvec(list);

Get an array

    float[] arr = vec.toArray();

Binary Vectors

Create a binary vector from a byte array

    PGbit vec = new PGbit(new byte[] {(byte) 0b00000000, (byte) 0b11111111});

Or a boolean array

    PGbit vec = new PGbit(new boolean[] {true, false, true});

Or a string

    PGbit vec = new PGbit("101");

Get the length (number of bits)

    int length = vec.length();

Get a byte array

    byte[] bytes = vec.toByteArray();

Or a boolean array

    boolean[] bits = vec.toArray();

Sparse Vectors

Create a sparse vector from an array

    PGsparsevec vec = new PGsparsevec(new float[] {1, 0, 2, 0, 3, 0});

Or a map of non-zero elements

    Map<Integer, Float> map = new HashMap<Integer, Float>();
    map.put(Integer.valueOf(0), Float.valueOf(1));
    map.put(Integer.valueOf(2), Float.valueOf(2));
    map.put(Integer.valueOf(4), Float.valueOf(3));
    PGsparsevec vec = new PGsparsevec(map, 6);

Note: Indices start at 0
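
To show how the two constructor forms above relate, the sketch below converts a dense array into a map of its non-zero elements keyed by 0-based index, and back again given the number of dimensions. It is a plain-Java illustration of the representation only; SparseSketch, toSparse, and toDense are hypothetical names and say nothing about how PGsparsevec stores its data internally.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of pgvector-java.
final class SparseSketch {
    // Collects the non-zero elements of a dense array, keyed by 0-based index.
    static Map<Integer, Float> toSparse(float[] dense) {
        Map<Integer, Float> nonZero = new TreeMap<>();
        for (int i = 0; i < dense.length; i++) {
            if (dense[i] != 0f) {
                nonZero.put(i, dense[i]);
            }
        }
        return nonZero;
    }

    // Rebuilds the dense array; positions missing from the map stay at 0.
    static float[] toDense(Map<Integer, Float> nonZero, int dimensions) {
        float[] dense = new float[dimensions];
        for (Map.Entry<Integer, Float> entry : nonZero.entrySet()) {
            int index = entry.getKey();
            if (index < 0 || index >= dimensions) {
                throw new IllegalArgumentException("index out of range: " + index);
            }
            dense[index] = entry.getValue();
        }
        return dense;
    }

    public static void main(String[] args) {
        Map<Integer, Float> sparse = toSparse(new float[] {1, 0, 2, 0, 3, 0});
        System.out.println(sparse);                              // {0=1.0, 2=2.0, 4=3.0}
        System.out.println(Arrays.toString(toDense(sparse, 6))); // [1.0, 0.0, 2.0, 0.0, 3.0, 0.0]
    }
}
```

The dimension count has to be supplied separately when starting from the map, because trailing zeros leave no trace in it. This is why the map constructor above takes 6 as a second argument.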

Get the number of dimensions

    int dim = vec.getDimensions();

Get the indices of non-zero elements

    int[] indices = vec.getIndices();

Get the values of non-zero elements

    float[] values = vec.getValues();

Get an array

    float[] arr = vec.toArray();

View the changelog

Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development:

    git clone https://github.com/pgvector/pgvector-java.git
    cd pgvector-java
    createdb pgvector_java_test
    mvn test

To run an example:

    cd examples/loading
    createdb pgvector_example
    mvn package
    java -jar target/example-jar-with-dependencies.jar