Using Hugging Face Models With Spring AI and Ollama

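1. Overview

Artificial intelligence is changing the way we build web applications. Hugging Face is a popular platform that provides a vast collection of open-source and pre-trained LLMs.

We can use Ollama, an open-source tool, to run LLMs on our local machines. It supports running GGUF format models from Hugging Face.

In this tutorial, we'll explore how to use Hugging Face models with Spring AI and Ollama. We'll build a simple chatbot using a chat completion model and implement semantic search using an embedding model.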

2. Dependencies

Let's start by adding the necessary dependency to our project's pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    <version>1.0.0-M5</version>
</dependency>

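The Ollama starter dependency helps us establish a connection with the Ollama service. We'll use it to pull and run our chat completion and embedding models.

Since the current version, 1.0.0-M5, is a milestone release, we'll also need to add the Spring Milestones repository to our pom.xml: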

<repositories>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones</name>
        <url>https://repo.spring.io/milestone</url>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
</repositories>

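This repository is where milestone versions are published, as opposed to the standard Maven Central repository.

3. Setting Up Ollama With Testcontainers

To facilitate local development and testing, we'll use Testcontainers to set up the Ollama service.

3.1. Test Dependencies

First, let's add the necessary test dependencies to our pom.xml: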

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-spring-boot-testcontainers</artifactId>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>ollama</artifactId>
    <scope>test</scope>
</dependency>

We import the Spring AI Testcontainers dependency for Spring Boot and the Ollama module of Testcontainers.

3.2. Defining Testcontainers Beans

Next, let's create a @TestConfiguration class that defines our Testcontainers beans:

@TestConfiguration(proxyBeanMethods = false)
class TestcontainersConfiguration {

    @Bean
    public OllamaContainer ollamaContainer() {
        return new OllamaContainer("ollama/ollama:0.5.4");
    }

    @Bean
    public DynamicPropertyRegistrar dynamicPropertyRegistrar(OllamaContainer ollamaContainer) {
        return registry -> {
            registry.add("spring.ai.ollama.base-url", ollamaContainer::getEndpoint);
        };
    }
}

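When creating the OllamaContainer bean, we pin the Ollama image to version 0.5.4, the latest stable release at the time of writing.

Then, we define a DynamicPropertyRegistrar bean to configure the base-url of the Ollama service. This allows our application to connect to the started Ollama container.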
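
Note that the registrar receives the method reference ollamaContainer::getEndpoint rather than a fixed string, so the endpoint is looked up only when the property is read, once the container is running and its mapped port is known. The following plain-Java sketch illustrates that idea of lazy registration. LazyPropertyRegistry, LazyPropertyDemo, and the sample endpoint value are hypothetical names made up for illustration and aren't part of Spring or Testcontainers:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a registry that stores value suppliers, not values.
class LazyPropertyRegistry {

    private final Map<String, Supplier<Object>> suppliers = new HashMap<>();

    // Registers a supplier; nothing is evaluated at registration time.
    void add(String name, Supplier<Object> valueSupplier) {
        suppliers.put(name, valueSupplier);
    }

    // Evaluates the supplier only when the property is actually read.
    Object get(String name) {
        Supplier<Object> supplier = suppliers.get(name);
        return supplier == null ? null : supplier.get();
    }
}

class LazyPropertyDemo {

    // Simulates an endpoint that becomes known only after the container starts.
    static String endpoint;

    public static void main(String[] args) {
        LazyPropertyRegistry registry = new LazyPropertyRegistry();
        registry.add("spring.ai.ollama.base-url", () -> endpoint);

        // Made-up value, assigned after the property was registered
        endpoint = "http://localhost:32768";
        System.out.println(registry.get("spring.ai.ollama.base-url"));
    }
}
```

Because the supplier runs at read time, the value assigned after registration is the one that gets returned.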

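3.3. Using Testcontainers During Development

While Testcontainers is primarily used for integration testing, we can use it during local development too.

To achieve this, we'll create a separate main class in our src/test/java directory: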

public class TestApplication {

    public static void main(String[] args) {
        SpringApplication.from(Application::main)
          .with(TestcontainersConfiguration.class)
          .run(args);
    }
}

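We create a TestApplication class and, inside its main() method, start our main Application class with the TestcontainersConfiguration class.

This setup helps us run our Spring Boot application and connect it to the Ollama service started via Testcontainers.

4. Using a Chat Completion Model

Now that we've got our local Ollama container set up, let's use a chat completion model to build a simple chatbot.

4.1. Configuring the Chat Model and Chatbot Beans

Let's start by configuring a chat completion model in our application.yaml file: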

spring:
  ai:
    ollama:
      init:
        pull-model-strategy: when_missing
      chat:
        options:
          model: hf.co/microsoft/Phi-3-mini-4k-instruct-gguf

To configure a Hugging Face model, we use the format hf.co/{username}/{repository}. Here, we specify the GGUF version of the Phi-3-mini-4k-instruct model provided by Microsoft.
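
To make the naming format concrete, here's a minimal sketch that assembles such a model name from its two parts. HuggingFaceModelReference is a hypothetical helper written only for illustration; Spring AI simply expects the resulting string in the configuration:

```java
// Hypothetical helper that assembles a model name in the hf.co/{username}/{repository} format.
class HuggingFaceModelReference {

    private static final String PREFIX = "hf.co/";

    static String of(String username, String repository) {
        if (username == null || username.isBlank() || repository == null || repository.isBlank()) {
            throw new IllegalArgumentException("username and repository must not be blank");
        }
        return PREFIX + username + "/" + repository;
    }

    public static void main(String[] args) {
        // Assembles the same model name that appears in the configuration
        System.out.println(of("microsoft", "Phi-3-mini-4k-instruct-gguf"));
    }
}
```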

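Our implementation doesn't strictly require this model. **Our recommendation is to set up the codebase locally and try out more chat completion models**.

Additionally, we set the pull-model-strategy to when_missing. This ensures that Spring AI pulls the specified model if it isn't available locally.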
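
The decision behind when_missing can be sketched in plain Java as follows. This is only an illustration of the idea, not Spring AI's implementation: ModelPuller is a hypothetical class, the download is simulated by adding a name to a set, and the ALWAYS and NEVER values are assumptions that go beyond what this article covers:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a "pull only when missing" decision; not Spring AI's implementation.
class ModelPuller {

    enum Strategy { ALWAYS, WHEN_MISSING, NEVER }

    private final Set<String> localModels = new HashSet<>();
    private int pullCount = 0;

    // Returns true if a pull was performed for the given model.
    boolean ensureAvailable(String model, Strategy strategy) {
        boolean shouldPull = switch (strategy) {
            case ALWAYS -> true;
            case WHEN_MISSING -> !localModels.contains(model);
            case NEVER -> false;
        };
        if (shouldPull) {
            localModels.add(model); // stands in for the actual download
            pullCount++;
        }
        return shouldPull;
    }

    int pullCount() {
        return pullCount;
    }

    public static void main(String[] args) {
        ModelPuller puller = new ModelPuller();
        String model = "hf.co/microsoft/Phi-3-mini-4k-instruct-gguf";
        System.out.println(puller.ensureAvailable(model, Strategy.WHEN_MISSING)); // first start: pulls
        System.out.println(puller.ensureAvailable(model, Strategy.WHEN_MISSING)); // already present: skips
    }
}
```

With this strategy, only the first startup pays the cost of downloading the model, while later startups reuse the local copy.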

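When a valid model is configured, Spring AI automatically creates a bean of type ChatModel, allowing us to interact with the chat completion model.

Let's use it to define the additional beans required for our chatbot: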

@Configuration
class ChatbotConfiguration {

    @Bean
    public ChatMemory chatMemory() {
        return new InMemoryChatMemory();
    }

    @Bean
    public ChatClient chatClient(ChatModel chatModel, ChatMemory chatMemory) {
        return ChatClient
          .builder(chatModel)
          .defaultAdvisors(new MessageChatMemoryAdvisor(chatMemory))
          .build();
    }
}

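First, we define a ChatMemory bean using the InMemoryChatMemory implementation. It maintains conversation context by storing the chat history in memory.

Next, using the ChatMemory and ChatModel beans, we create a bean of type ChatClient, which is our main entry point for interacting with the chat completion model.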
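
To picture what keeping the chat history in memory means, here's a minimal sketch of conversation-scoped storage. ConversationMemory is a hypothetical class that stores plain strings. It isn't Spring AI's InMemoryChatMemory, and it ignores concerns such as thread-safe list updates:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of conversation-scoped memory; not Spring AI's InMemoryChatMemory.
class ConversationMemory {

    private final Map<String, List<String>> history = new ConcurrentHashMap<>();

    // Appends a message to the history of the given conversation.
    void add(String conversationId, String message) {
        history.computeIfAbsent(conversationId, id -> new ArrayList<>()).add(message);
    }

    // Returns up to the last N messages of a conversation, oldest first.
    List<String> get(String conversationId, int lastN) {
        List<String> messages = history.getOrDefault(conversationId, List.of());
        int from = Math.max(0, messages.size() - lastN);
        return List.copyOf(messages.subList(from, messages.size()));
    }

    public static void main(String[] args) {
        ConversationMemory memory = new ConversationMemory();
        memory.add("chat-1", "user: Who wanted to kill Harry Potter?");
        memory.add("chat-1", "assistant: Lord Voldemort.");
        memory.add("chat-2", "user: Hello!");
        System.out.println(memory.get("chat-1", 10)); // only the chat-1 messages
    }
}
```

Since the history lives in a map inside the application, it's lost on restart, which is acceptable for a demonstration but not for production use.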

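4.2. Implementing the Chatbot

With our configuration in place, let's create a ChatbotService class. We'll inject the ChatClient bean we defined earlier to interact with our model.

But first, let's define two simple records to represent the chat request and response: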

record ChatRequest(@Nullable UUID chatId, String question) {}

record ChatResponse(UUID chatId, String answer) {}

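The ChatRequest contains the user's question and an optional chatId to identify an ongoing conversation.

Similarly, the ChatResponse contains the chatId and the chatbot's answer.

Now, let's implement the intended functionality: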

public ChatResponse chat(ChatRequest chatRequest) {
    UUID chatId = Optional
      .ofNullable(chatRequest.chatId())
      .orElse(UUID.randomUUID());
    String answer = chatClient
      .prompt()
      .user(chatRequest.question())
      .advisors(advisorSpec ->
          advisorSpec
            .param("chat_memory_conversation_id", chatId))
      .call()
      .content();
    return new ChatResponse(chatId, answer);
}

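If the incoming request doesn't contain a chatId, we generate a new one. This allows the user to start a new conversation or continue an existing one.

We pass the user's question to the chatClient bean and set the chat_memory_conversation_id parameter to the resolved chatId to maintain the conversation history.

Finally, we return the chatbot's answer along with the chatId.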
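
The chatId resolution step can be isolated and checked on its own. The sketch below uses a hypothetical ChatIdResolver helper and, as a small variation, calls orElseGet() so that a random UUID is generated only when the request doesn't carry one:

```java
import java.util.Optional;
import java.util.UUID;

// Hypothetical helper isolating the chatId resolution step.
class ChatIdResolver {

    // Reuses the supplied ID, or generates a fresh one when none is given.
    static UUID resolve(UUID requestedChatId) {
        return Optional
          .ofNullable(requestedChatId)
          .orElseGet(UUID::randomUUID);
    }

    public static void main(String[] args) {
        UUID existing = UUID.fromString("7b8a36c7-2126-4b80-ac8b-f9eedebff28a");
        System.out.println(resolve(existing).equals(existing)); // prints "true"
        System.out.println(resolve(null) != null);              // prints "true"
    }
}
```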

4.3. Interacting With Our Chatbot

Now that we've implemented our service layer, let's expose a REST API on top of it:

@PostMapping("/chat")
public ResponseEntity<ChatResponse> chat(@RequestBody ChatRequest chatRequest) {
    ChatResponse chatResponse = chatbotService.chat(chatRequest);
    return ResponseEntity.ok(chatResponse);
}

We'll use the above API endpoint to interact with our chatbot.

Let's use the HTTPie CLI to start a new conversation:

http POST :8080/chat question="Who wanted to kill Harry Potter?"

We send a simple question to the chatbot. Let's see what we get as a response:

{
    "chatId": "7b8a36c7-2126-4b80-ac8b-f9eedebff28a",
    "answer": "Lord Voldemort, also known as Tom Riddle, wanted to kill Harry Potter because of a prophecy that foretold a boy born at the end of July would have the power to defeat him."
}

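The response contains a unique chatId and the chatbot's answer to our question.

Let's continue this conversation by sending a follow-up question using the chatId from the above response: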

http POST :8080/chat chatId="7b8a36c7-2126-4b80-ac8b-f9eedebff28a" question="Who should he have gone after instead?"

Let's see if the chatbot can maintain the context of our conversation and provide a relevant response:

{
    "chatId": "7b8a36c7-2126-4b80-ac8b-f9eedebff28a",
    "answer": "Based on the prophecy's criteria, Voldemort could have targeted Neville Longbottom instead, as he was also born at the end of July to parents who had defied Voldemort three times."
}

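As we can see, the chatbot does indeed maintain the conversation context, as it references the prophecy we discussed in the previous message.

The chatId remains the same, indicating that the follow-up answer is a continuation of the same conversation.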
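
The same two requests can also be issued from Java. The sketch below builds the POST request with the JDK's java.net.http API. ChatRequestBuilder is a hypothetical helper, the host and port are assumed to be localhost:8080 as in the HTTPie calls, and the JSON escaping is deliberately minimal, covering only quotes and backslashes:

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper; builds the request but doesn't send it.
class ChatRequestBuilder {

    // Builds the JSON body; chatId may be null to start a new conversation.
    static String body(String chatId, String question) {
        // Minimal escaping for illustration: backslashes first, then double quotes
        String escaped = question.replace("\\", "\\\\").replace("\"", "\\\"");
        String chatIdField = chatId == null ? "" : "\"chatId\":\"" + chatId + "\",";
        return "{" + chatIdField + "\"question\":\"" + escaped + "\"}";
    }

    static HttpRequest build(String chatId, String question) {
        return HttpRequest.newBuilder(URI.create("http://localhost:8080/chat"))
          .header("Content-Type", "application/json")
          .POST(HttpRequest.BodyPublishers.ofString(body(chatId, question)))
          .build();
    }

    public static void main(String[] args) {
        System.out.println(body(null, "Who wanted to kill Harry Potter?"));
        // To send it: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    }
}
```

Passing the chatId returned by the first call into build() for the second call continues the same conversation.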

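5. Using an Embedding Model

Moving on from the chat completion model, we'll now use an embedding model to implement semantic search over a small dataset of quotes.

We'll fetch the quotes from an external API, store them in an in-memory vector store, and perform a semantic search.

5.1. Fetching Quote Records From an External API

For our demonstration, we'll use the QuoteSlate API to fetch quotes.

Let's create a QuoteFetcher utility class for this: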

class QuoteFetcher {

    private static final String BASE_URL = "https://quoteslate.vercel.app";
    private static final String API_PATH = "/api/quotes/random";
    private static final int DEFAULT_COUNT = 50;

    public static List<Quote> fetch() {
        return RestClient
          .create(BASE_URL)
          .get()
          .uri(uriBuilder ->
              uriBuilder
                .path(API_PATH)
                .queryParam("count", DEFAULT_COUNT)
                .build())
          .retrieve()
          .body(new ParameterizedTypeReference<>() {});
    }
}

record Quote(String quote, String author) {}

Using RestClient, we invoke the QuoteSlate API with the default count of 50 and use ParameterizedTypeReference to deserialize the API response into a list of Quote records.
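
For reference, the URL that results from combining the base URL, the path, and the count query parameter can be reproduced with plain JDK classes. QuoteUri is a hypothetical class used only to show the final request URL:

```java
import java.net.URI;

// Hypothetical sketch reproducing the URL that QuoteFetcher requests.
class QuoteUri {

    private static final String BASE_URL = "https://quoteslate.vercel.app";
    private static final String API_PATH = "/api/quotes/random";

    static URI forCount(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be positive");
        }
        return URI.create(BASE_URL + API_PATH + "?count=" + count);
    }

    public static void main(String[] args) {
        System.out.println(forCount(50)); // prints "https://quoteslate.vercel.app/api/quotes/random?count=50"
    }
}
```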

5.2. Configuring and Populating an In-Memory Vector Store

Now, let's configure an embedding model in our application.yaml:

spring:
  ai:
    ollama:
      embedding:
        options:
          model: hf.co/nomic-ai/nomic-embed-text-v1.5-GGUF

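We use the GGUF version of the [nomic-embed-text-v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF) model provided by nomic-ai. Again, feel free to try this implementation with a different embedding model.

After specifying a valid model, Spring AI automatically creates a bean of type EmbeddingModel for us.

Let's use it to create a vector store bean: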

@Bean
public VectorStore vectorStore(EmbeddingModel embeddingModel) {
    return SimpleVectorStore
      .builder(embeddingModel)
      .build();
}

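For our demonstration, we create a bean of the SimpleVectorStore class. It's an in-memory implementation that emulates a vector store using the java.util.Map class.

Now, to populate our vector store with quotes during application startup, we'll create a VectorStoreInitializer class that implements the ApplicationRunner interface: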

@Component
class VectorStoreInitializer implements ApplicationRunner {

    private final VectorStore vectorStore;

    // standard constructor

    @Override
    public void run(ApplicationArguments args) {
        List<Document> documents = QuoteFetcher
          .fetch()
          .stream()
          .map(quote -> {
              Map<String, Object> metadata = Map.of("author", quote.author());
              return new Document(quote.quote(), metadata);
          })
          .toList();
        vectorStore.add(documents);
    }
}

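In our VectorStoreInitializer, we autowire an instance of VectorStore.

Inside the run() method, we use our QuoteFetcher utility class to retrieve a list of Quote records. Then, we map each quote into a Document and configure the author field as metadata.

Finally, we store all the documents in our vector store. When we invoke the add() method, Spring AI automatically converts our plaintext content into a vector representation before storing it in the vector store. We don't need to convert it explicitly using the EmbeddingModel bean.

5.3. Testing Semantic Search

With our vector store populated, let's validate our semantic search functionality: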

private static final int MAX_RESULTS = 3;

@ParameterizedTest
@ValueSource(strings = {"Motivation", "Happiness"})
void whenSearchingQuotesByTheme_thenRelevantQuotesReturned(String theme) {
    SearchRequest searchRequest = SearchRequest
      .builder()
      .query(theme)
      .topK(MAX_RESULTS)
      .build();
    List<Document> documents = vectorStore.similaritySearch(searchRequest);

    assertThat(documents)
      .hasSizeBetween(1, MAX_RESULTS)
      .allSatisfy(document -> {
          String author = String.valueOf(document.getMetadata().get("author"));
          assertThat(author)
            .isNotBlank();
      });
}

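Here, we pass a few common quote themes to our test method using @ValueSource. Then, we create a SearchRequest object with the theme as the query and MAX_RESULTS as the number of desired results.

Next, we call the similaritySearch() method of our vectorStore bean with the searchRequest. Similar to the add() method of the VectorStore, Spring AI converts our query into its vector representation before querying the vector store.

The returned documents will contain quotes that are semantically related to the given theme, even if they don't contain the exact keyword.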
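
To illustrate how a map-backed store can rank entries by meaning rather than by keywords, here's a minimal sketch. It assumes cosine similarity as the ranking measure and uses hand-written two-dimensional vectors in place of real embeddings. TinyVectorStore and the sample quotes are hypothetical and only hint at what an in-memory vector store does:

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a map-backed vector store; embeddings are supplied by hand here,
// whereas Spring AI obtains them from the configured EmbeddingModel.
class TinyVectorStore {

    private final Map<String, float[]> vectors = new LinkedHashMap<>();

    void add(String content, float[] embedding) {
        vectors.put(content, embedding);
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|); assumes equal-length, non-zero vectors.
    static double cosineSimilarity(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the contents of the topK entries most similar to the query embedding.
    List<String> similaritySearch(float[] queryEmbedding, int topK) {
        return vectors.entrySet().stream()
          .sorted(Comparator.comparingDouble(
            (Map.Entry<String, float[]> entry) -> cosineSimilarity(queryEmbedding, entry.getValue()))
            .reversed())
          .limit(topK)
          .map(Map.Entry::getKey)
          .toList();
    }

    public static void main(String[] args) {
        TinyVectorStore store = new TinyVectorStore();
        store.add("Keep going, one step at a time.", new float[] {0.9f, 0.1f});
        store.add("Joy is found in small things.", new float[] {0.1f, 0.9f});
        // A made-up query embedding that points in nearly the same direction as the first entry
        System.out.println(store.similaritySearch(new float[] {1.0f, 0.0f}, 1));
    }
}
```

The query never has to share a word with the stored text. What matters is only how closely the two vectors point in the same direction.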

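6. Conclusion

In this article, we've explored using Hugging Face models with Spring AI.

Using Testcontainers, we set up the Ollama service, creating a local test environment.

First, we used a chat completion model to build a simple chatbot. Then, we implemented semantic search using an embedding model.

As always, all the code examples used in this article are available on GitHub.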

This work is an original or a translation, licensed under the Attribution-NonCommercial-NoDerivatives 4.0 International license.