I've been experimenting a bit with vector embeddings and semantic search for a proof of concept for a customer of ours.
The way it works in Thinkwise is clear to me, and I think I can get that working.
However, all the examples in the documentation are based on small texts, like an email or a chat history. In my POC I'm dealing with legal documents of more than 80 pages that I need to search through.
From some research it seems that the best approach is to chop the text into chunks, which can be done in various ways.
ChatGPT suggests a standard approach of chopping the text into chunks of 800 tokens with an overlap of 400.
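To make it concrete, this is roughly what I have in mind for the chunking step. It's just a sketch: it uses whitespace-separated words as a rough proxy for tokens (for exact token counts you'd run a real tokenizer such as tiktoken), and the 800/400 numbers are the values mentioned above, not something I've validated:

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 400) -> list[str]:
    """Split text into overlapping chunks.

    Words are used as a stand-in for tokens here; swap in a proper
    tokenizer (e.g. tiktoken) if you need exact token counts.
    """
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last window already covers the end of the text
    return chunks
```

Each chunk would then be embedded separately, and a search hit points you back to the part of the document it came from.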
At the moment I use the vector embeddings of ChatGPT, but this can become quite expensive, so I'd prefer to do as much as possible locally.
I was wondering if someone already has experience with a best practice for doing so. Maybe a Python service that can be called via an API, or something else? Or any other tips/tricks?