Date: 2024-06-17

With Capt2PDF, screenshots can be generated automatically at intervals (e.g. every five seconds). They are converted into searchable PDF files in the background.

Artificial intelligence (AI) seems to be everywhere right now. Microsoft is currently gaining some notoriety with its Recall service, newly announced for Windows 11, which aims to become the user’s memory on their computer, so to speak. Screenshots are taken every x seconds. This data is then “fairly” processed by AI in the background so that it can be searched later. Capt2PDF is a new tool of this kind for Linux. In contrast to Microsoft Recall, Capt2PDF runs on the local desktop; control of the data remains with the user.

There are two basic ways to record screen content. At present, OBS (Open Broadcaster Software) is mostly used. It creates video files (including audio) of the screen activity. Searchable PDF files can then be created from these using text and speech recognition.

Video files created with OBS require a lot of space, since about 30 frames are recorded per second. The space required is on the order of several tens of megabytes per minute. Text recognition and the subsequent speech recognition are computationally expensive; there is currently no tool that does this automatically.

The alternative is the classic approach. On any Linux desktop, screenshots are probably created with the “PrintScreen” key (top row, third from the right). When several consecutive steps have to be documented (for example, in support), this quickly becomes labor-intensive. The individual image files must be combined into a single file, most often a PDF. For the content to be searchable, text recognition and so on must also be started.

Microsoft or AI as a “role model”?

Microsoft Recall addresses exactly this concern. However, automatic content recording is not new at all. What’s new is that screenshots are automatically generated in real time so the data can be searched.

Whether the cloud should or must somehow be used for this remains a mystery. Obviously, it has to do with the fact that manufacturers want to “copy” as much data as possible. Capt2PDF shows that there is another way. With Capt2PDF, screenshots are generated at fixed time intervals (for example, every 5 seconds). In the background, they are compiled into a searchable PDF file. This makes it easy to search the recorded screenshots.

The source code can be obtained as a zip file at https://archivista.ch/cms/wp-content/uploads/2024/06/capt2pdf.zip. Unzip this file. The program can then be started in a terminal with perl capt2pdf 5 (recording at 5-second intervals). With perl capt2pdf 0, recording is stopped. It makes sense to bind these two commands to key combinations.

On the AVMultimedia desktop (see https://archivista.ch/cms/de/support/avmultimedia), the key combination Ctrl+PrintScreen starts recording and Shift+Ctrl+PrintScreen ends it. In between, the screen is captured at 5-second intervals. A few seconds after the program exits, the searchable PDF file is displayed on the desktop. Easy, right?

Does Capt2PDF need AI or the cloud?

Not even remotely. Capt2PDF runs 100% locally, and any reasonably modern CPU is capable of doing the job. Capt2PDF currently contains about 260 lines of code.

The software was developed for AVMultimedia and ArchivistaBox, but it should also work on most other Linux desktops. Under X11 it uses scrot to take screenshots; under Wayland it uses Scene or gnome-screenshot.

Note that, in addition to the capture tool, the programs compare, convert, tesseract and pdftk must be present at a minimum. Capt2PDF starts a one-line bash loop that generates screenshots until Capt2PDF is told to exit. After each screenshot, Capt2PDF is invoked, picks up the current image, and has tesseract recognize the text (OCR) in the background.
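The capture loop described above could look roughly like the following Python sketch. It is not Capt2PDF's actual code (which is written in Perl and uses a bash one-liner); the function names, the stop-file mechanism and the file naming are assumptions made for illustration. Only scrot (X11) and gnome-screenshot (Wayland) are used as capture tools here.

```python
import os
import subprocess
import time


def capture_command(session_type, target):
    """Return an argv list that takes one screenshot and writes it to target.

    The mapping from session type to tool is an assumption for this sketch.
    """
    if session_type == "wayland":
        return ["gnome-screenshot", "-f", target]
    return ["scrot", target]


def record(interval_seconds, out_dir, stop_file):
    """Take a screenshot every interval_seconds until stop_file exists.

    stop_file is a hypothetical way to signal "stop recording".
    """
    session = os.environ.get("XDG_SESSION_TYPE", "x11")
    os.makedirs(out_dir, exist_ok=True)
    counter = 0
    while not os.path.exists(stop_file):
        # Zero-padded names keep the pages in capture order when sorted.
        target = os.path.join(out_dir, f"shot-{counter:06d}.png")
        subprocess.run(capture_command(session, target), check=False)
        counter += 1
        time.sleep(interval_seconds)
```

Calling record(5, "shots", "shots/stop") would correspond to a 5-second interval; creating the stop file ends the loop after the current pause.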

The ImageMagick utility compare is used to check whether the current screenshot differs from the previous one. Only then does it make sense to create a searchable PDF page from it. Currently, 1% of the pixels on the screen must have changed; otherwise the screenshot is discarded. This means that no new pages are created when the mouse is merely moved or status displays are updated.
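The 1% rule can be illustrated with a small pure-Python sketch. It stands in for the ImageMagick comparison and is not how Capt2PDF itself is implemented: images are assumed to be flat sequences of pixel values, and the function names are hypothetical.

```python
def changed_fraction(previous, current):
    """Return the fraction of pixels that differ between two screenshots.

    Both arguments are flat sequences of pixel values of equal length.
    A size mismatch (e.g. a resolution change) counts as fully changed.
    """
    if len(previous) != len(current):
        return 1.0
    if not current:
        return 0.0
    differing = sum(1 for old, new in zip(previous, current) if old != new)
    return differing / len(current)


def should_keep(previous, current, threshold=0.01):
    """Keep a screenshot only if more than `threshold` of its pixels changed."""
    return changed_fraction(previous, current) > threshold
```

With 1000 pixels, a mouse movement that alters 5 of them (0.5%) would be discarded, while a change to 20 of them (2%) would produce a new page.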

If more than 1% of the pixels have changed, tesseract is called. Before text recognition starts, convert is used to store the required resolution (dpi) in the image file. 300 dpi always produces good text recognition (OCR) results. A 4K display with a resolution of 3840 x 2160 pixels corresponds roughly to an A4 page in landscape format at 300 dpi (3508 x 2480 dots for A4).
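The dpi arithmetic behind this comparison can be checked with a few lines of Python. This is only a worked calculation; it takes 3840 x 2160 as the usual 4K UHD resolution and 297 x 210 mm as A4 landscape, and the helper names are made up for the example.

```python
MM_PER_INCH = 25.4


def pixels_to_mm(pixels, dpi):
    """Physical length in millimetres of `pixels` dots printed at `dpi`."""
    return pixels / dpi * MM_PER_INCH


def mm_to_pixels(mm, dpi):
    """Number of dots (rounded) needed to cover `mm` millimetres at `dpi`."""
    return round(mm / MM_PER_INCH * dpi)


# A 4K screenshot interpreted at 300 dpi:
width_mm = pixels_to_mm(3840, 300)
height_mm = pixels_to_mm(2160, 300)
print(f"4K at 300 dpi: {width_mm:.1f} x {height_mm:.1f} mm")
# prints: 4K at 300 dpi: 325.1 x 182.9 mm

# A4 landscape (297 x 210 mm) expressed in dots at 300 dpi:
print(mm_to_pixels(297, 300), "x", mm_to_pixels(210, 300))
# prints: 3508 x 2480
```

The screenshot is therefore slightly wider and slightly lower than an A4 landscape page, which is why the two are only roughly equivalent.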

Pages are created directly while recording. To make sure slow CPUs are not overloaded, Capt2PDF checks the number of CPU cores. If more than half of the cores are in use by tesseract, text recognition does not take place until sufficient resources are available again. Capt2PDF then lags “slightly” behind. However, it should be noted that this only happens with very old processors; tesseract usually needs about 2 seconds per page.
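The throttling rule can be sketched as a single check. This is an interpretation of the description above, not code taken from Capt2PDF; the function name and the way running jobs are counted are assumptions.

```python
import os


def can_start_ocr(running_ocr_jobs, cpu_cores=None):
    """Return True if another OCR job may start.

    A new job is held back while more than half of the CPU cores
    are already busy with OCR jobs.
    """
    if cpu_cores is None:
        # os.cpu_count() may return None on unusual platforms.
        cpu_cores = os.cpu_count() or 1
    return running_ocr_jobs * 2 <= cpu_cores
```

On a 4-core machine this allows a new job while up to two are running and defers it once three are active; deferred screenshots are simply processed later.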

The corresponding PDF files are stored as individual pages in a subfolder of the home directory (e.g. under /home/archivista). When Capt2PDF exits, pdftk is used to create a combined PDF file. This process takes about five to ten seconds, even for hundreds of pages. Finally, the searchable PDF file is displayed in the viewer qpdfview.
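Combining the individual pages could be done along the following lines. The sketch only builds the pdftk command line; the zero-padded file names and the function name are assumptions, and sorting by name stands in for whatever ordering Capt2PDF actually uses.

```python
def merge_command(page_files, target):
    """Build a pdftk argv list that concatenates page_files into target.

    Pages are sorted by name, so zero-padded names give capture order.
    """
    if not page_files:
        raise ValueError("no pages to merge")
    return ["pdftk", *sorted(page_files), "cat", "output", target]
```

The resulting list can be passed to subprocess.run(..., check=True) to perform the merge.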

A little advertising at this point: anyone who needs a more detailed evaluation or an integrated search across multiple PDF files will find ArchivistaBox to be a good workhorse. If you don’t want to use a full-fledged document management system (DMS), you can of course merge multiple searchable PDF files created in this way into a single file. The corresponding call is simple:

pdftk file1.pdf file2.pdf cat output allfiles.pdf

Capt2PDF is of course open source (GPLv2). It was developed for the Linux distribution AVMultimedia (and therefore also for ArchivistaBox) and is available there starting from version 2024/VI. Currently measuring about 260 lines of code, it is an easy-to-use tool that can be used to easily and conveniently capture content when searching the Internet, for example.

In short, it’s a useful program for recording screen contents so you don’t have to open Pandora’s box. This means without any AI or cloud, but simply, transparently (100% open source), locally and in real-time on your Linux desktop.

sources:

https://archivista.ch/cms/de/aktuell-blog/capt2pdf-mit-2024-vi/