Show HN: Open Rewind – POC for audio and screen and video streaming to S3
Got into a rabbit hole today.
POC works using 'npx efficient-recorder'.
Is this useful to anyone?
capturing and uploading a whole new PNG for each screencap is not what I would call 'efficient', and to meet the use case of Rewind.ai in the first place it should have some OCR mechanism to pull up the relevant screencaps.
The thing that enabled rewind.ai and MS Recall is storing the series of screenshots more like a HEIF, which allows for a massive compression ratio and on-device storage, plus OCR provided by the OS (Live Text since Monterey in 2021 [0]; Microsoft introduced it last year for Snapdragon-based AI PCs [1]).
I guess this is a good starting point if the goal is to fill S3 buckets with screencaps of multiple users, but then we're just back to corporate spyware, not tools for helping individuals use their machine more effectively.
That said, if I was using my own minio backend, it would be neat to archive my screen captures but I would change it so it captures after, say, every keystroke, and every time my mouse stops moving, and after every click. That way I have high density capture of taking actions, and low density otherwise. In any case collecting the data is not the issue, making an interface where that data becomes useful to help me remember something is.
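The trigger rules described above (capture after every keystroke and click, and once whenever the mouse stops moving) could be sketched roughly like this. It is only an illustration: `CaptureScheduler`, the event names, and the 0.5 second idle threshold are made-up assumptions, not part of efficient-recorder, and the actual screenshot call and input hooks are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CaptureScheduler:
    """Decides when to capture, given a stream of input events.

    Hypothetical helper: the event names ("key", "click", "mouse_move")
    and the idle threshold are illustrative assumptions.
    """
    mouse_idle_seconds: float = 0.5
    captures: List[float] = field(default_factory=list)
    _last_mouse_move: Optional[float] = None

    def on_event(self, kind: str, now: float) -> None:
        # Check first whether the mouse came to rest before this event.
        self.tick(now)
        if kind in ("key", "click"):
            # High-density capture: one capture per action.
            self.captures.append(now)
        elif kind == "mouse_move":
            # Still moving: remember the time and wait for it to stop.
            self._last_mouse_move = now

    def tick(self, now: float) -> None:
        """Call periodically; records one capture once the mouse has stopped."""
        if (self._last_mouse_move is not None
                and now - self._last_mouse_move >= self.mouse_idle_seconds):
            self.captures.append(self._last_mouse_move + self.mouse_idle_seconds)
            self._last_mouse_move = None
```

In a real recorder, each timestamp appended to `captures` would be replaced by an actual screenshot call, and `tick` would run on a timer.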
[0] https://support.apple.com/guide/preview/interact-with-text-i...
[1] https://learn.microsoft.com/en-us/windows/ai/apis/text-recog...
PNG per frame? Ouch.
If I was to build one of these, which I'm not, I would try for an RTSP-to-bucket uploader. That way you could do the actual capture and compression with OBS like any other streamer, or use it with IP security cameras etc. You'd probably end up with a pile of video-ts pieces which could be replayed later using HLS.
I know right :') Not very efficient yet.
I've also explored Swift AVFoundation to drop frames and colours at the moment of recording, but won't be implementing it at this time.
This was just a POC, and I couldn't hack that in a day.
ffmpeg works well, especially on apple silicon using video toolbox. That's how I approached it.
Also, it automatically costs no extra storage for identical screenshots (no activity) and is very cheap when you're just moving your mouse around or typing a few characters.
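For the PNG-per-frame approach, a much cruder stand-in for what a video codec gives you is to skip the upload when a frame is byte-identical to the previous one. A minimal sketch, assuming frames arrive as raw bytes (or a deterministic encoding) and that `upload` is whatever function writes to the bucket; `DedupingUploader` is a hypothetical name. Unlike inter-frame compression, this does nothing for frames that differ only slightly.

```python
import hashlib
from typing import Callable, Optional


class DedupingUploader:
    """Skips frames whose bytes match the previously uploaded frame.

    Hypothetical helper: `upload` stands in for an S3 put call.
    """

    def __init__(self, upload: Callable[[str, bytes], None]) -> None:
        self._upload = upload
        self._last_digest: Optional[str] = None
        self.skipped = 0

    def submit(self, key: str, frame: bytes) -> bool:
        """Uploads the frame unless it is identical to the last one sent."""
        digest = hashlib.sha256(frame).hexdigest()
        if digest == self._last_digest:
            self.skipped += 1  # no activity on screen: nothing to store
            return False
        self._upload(key, frame)
        self._last_digest = digest
        return True
```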
rem looks very intriguing, I'll give it a try, cross platform would be even better ofc. You're doing it without funding?
I wish the LLMs were tuned into the open source landscape more, so when someone has an idea for a POC and asks the bot to write it for them, it would go clippy mode and say "it looks like you're trying to build an open source rewrite of Rewind, would you like to clone the rem repo and contribute to that instead?" lol
No funding- I built mostly over 2023-2024 holidays and then a little here and there.
And yeah- lots of similar projects and/or possible startups have popped up as well
> we're just back to corporate spyware
Most feel that Recall is also this.
Interesting ideas! Seems worth exploring for sure.
> Hypothesis: the world's most valuable data is screen captures of outlier competent people going about their work. But very little of this data is recorded, let alone made publicly available.
It's not quite screen captures, but the way in which any given email is responded to by competent users in your own organization is highly relevant in this context, especially if you place original+reply email pairs into a RAG framework and add function calls for structured domain knowledge.
Unified APIs like https://www.nylas.com/ which an admin can unilaterally connect across an entire org can make this quite viable - assuming you've done the work to build a culture where radical transparency is seen as an opportunity rather than a threat.
There's a lot of nuance required to avoid hallucinations, but organizations that are merely training chatbots on explicit Q&A documents are just scratching the surface of the depth of their semi-structured data.
I pay for Rewind, and honestly, it’s one of the best investments I've made in software. After each Zoom meeting, I receive a summary of everything discussed, including action items to add to my to-do list. Every Monday, I also ask it to remind me of what I accomplished the previous week to help me prepare for my 1:1 meeting. Everything is recorded locally, allowing me to search for anything I did earlier in the day quickly.
Your readme states "MIT License - See LICENSE file for details" but there is no such license file. I've been seeing this a lot lately. Did you use an LLM to generate this part of the readme? If so, was MIT a conscious choice of yours?
Haha, yes, the README was generated and no, it was not a conscious choice. However I'm happy with the result.
Based on the repository description:
> Attempt to create an Open Source Privacy Focused Rewind.ai Alternative for data capture
I'd assume this was something local or at least for your local network. But this exclusively sends the data over to S3. And based on the lack of encryption keys or even passwords, I'm assuming this is even unencrypted?
It is indeed, but we can use our privately hosted S3 compatible server: https://github.com/minio/minio
I'm happy to think about e2e encryption.
I believe the S3 bit is just as an object storage standard
Love your enthusiasm! Our plan is to subsume Rewind functionality into Limitless. Sorry it has taken longer than I wanted. The pendant has taken a lot of our time and focus.
I'd love that, but most of all I'd prefer to host things on my own servers.
Also I'd buy a pendant if I can send it to my own S3!!!!
A FOSS alternative to Rewind that works on both MacOS and Linux would be a dream come true tbh. Thanks for working on this, I'll be trying it out sometime next week
I took a crack at this, but had trouble building a community. It's all open source.
Native MacOS in swift (the popular one with OCR / text selection from history), and cross platform (rust) without text selection from history and very much POC.
https://github.com/jasonjmcghee/rem
https://github.com/jasonjmcghee/xrem
If you don’t mind discussing it; what kind of community were you looking for/what challenges did you face building one?
The repo looks like it has quite a few stars and a smattering of issues which I thought meant real usage.
Not at all - I meant more on the contribution / builder community of - oh I want this feature, I'll build it and submit a PR! (kind of thing).
It's not quite there yet, this is just a POC and there's lots to win in efficiency.
I think it should be done in Swift tbh to get battery impact under 10%
Nice! This is needed, as it seems rewind.ai still stores locally, but Limitless, the product they seem to put more energy into, goes to the cloud.
I really like the rewind.ai retrieval mechanism. I believe their recording mechanism is highly broken. It often fails to sync to the os calendar and will ask you to record meetings you deleted months ago.
I don’t understand the webcam recording need. I’m not sure what signal you get from that since if you are in a web meeting you already have that on screen. Or if you are coding you might get a few WTF frown faces if working on a hard bug. But you made it optional, so that’s good.
Thank you all so much for chiming in about rewind. I’ve been ruminating about what to do about my subscription. To see that I’m not alone in paying for this app that the founder ditched… I finally feel heard. Thank you!
While we’re here, has anyone been able to export audio from Rewind.ai’s local storage?
I would really like the data to be stored... not in the cloud.
Must one set up an S3 compatible stack on a home server somewhere?
I haven't tried yet, but I think we can use https://github.com/minio/minio for this.
Minio works pretty solidly as an S3 compatible endpoint. It took me a while of juggling configuration with Docker to get it running correctly though.
It's useful to me in that you've identified an interesting niche! I like the idea! As for the implementation, eh, I'd probably rather code my own so it can be in bash or c++ ;-)
It's also useful to me in that it's a solid example of what can be done with LLMs these days, wow!
Also, tangentially, a long long time ago I had a similar system set up, except for packets, not screencaps or audio. A 24h ringbuffer on my router to log _everything_ was a cool-to-have that made debugging network issues easier.
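A time-windowed buffer like that could be sketched as follows. This is only a guess at the general shape, not the setup described above: `TimeWindowBuffer` is a hypothetical name, items are kept in memory, and anything older than the window is dropped as new items arrive.

```python
from collections import deque
from typing import Deque, Generic, List, Tuple, TypeVar

T = TypeVar("T")


class TimeWindowBuffer(Generic[T]):
    """Keeps only items recorded within the last `window_seconds`."""

    def __init__(self, window_seconds: float = 24 * 3600) -> None:
        self.window_seconds = window_seconds
        self._items: Deque[Tuple[float, T]] = deque()

    def append(self, timestamp: float, item: T) -> None:
        """Adds an item, then evicts everything older than the window."""
        self._items.append((timestamp, item))
        cutoff = timestamp - self.window_seconds
        while self._items and self._items[0][0] < cutoff:
            self._items.popleft()

    def snapshot(self) -> List[Tuple[float, T]]:
        """Returns the retained (timestamp, item) pairs, oldest first."""
        return list(self._items)
```

The same structure works for packets, screencaps or audio chunks; only the item type changes.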
Looks like many open alternatives of Rewind.ai already exist in various levels of completion.[1]
The issue with this one is that it misses the most important feature, the searchability. But you could probably focus on the low overhead aspect of your version.
[1]
* Screenpipe https://github.com/mediar-ai/screenpipe
* Memento https://github.com/apirrone/Memento
* Rem https://github.com/jasonjmcghee/rem
Excellent! This has been on my todo list for a while now, instead I’ll use this and contribute if needed.
How is uploading data to an s3 bucket privacy focused?
Open source makes it possible to host S3 yourself, as well as any other data processing later down the line. It doesn't need to be the same machine!