Detects correct rotation/degrees, but fails to read the text #940

Open
matsklevstad opened this issue Aug 1, 2024 · 1 comment
Labels
dependency bug: Valid bug where fixing is outside the scope of this repo

Comments

@matsklevstad

Tesseract.js version (version number for npm/GitHub release, or specific commit for repo)
https://cdn.jsdelivr.net/npm/tesseract.js@5/dist/tesseract.min.js

Describe the bug
When uploading pictures that are rotated to the left, to the right, or upside down, Tesseract.js correctly reports 90, 180 or 270 degrees in most cases. But when reading the text, it only outputs "rubbish" like this:

1a1e1014321psåuniuep) ) av 1uaDrpodå 12 134e1hu10.] HLISJAPuf 'NYN NNN NNN 21095 2pusv$oopLs UejEpundg 00'20'0€ (...)

The same image uploaded with correct rotation gives this output, which is great:

FOR VIDEREGÅENDE OPPLÆRING Navn: — Wilhelm Khalid Tjemsland Sletvold Fødselsnummer : 21126423464 har gjennomført opplæring som omfatter i utdanningsprogram for MEMOK ]1---- Medjer og konunuqikasjon, lår Studiespesialisering bestått MEMOK2---- — Medier og kommunikasjon, 2. år Studiespesialisering bestått MEMOK3---- — Medier og kommunikasjon, 3. år Studiespesialisering fullført (...)

To Reproduce
Save the following code in a .html file, and load it in the browser. Upload the image below.

<!DOCTYPE html>
<html>
  <head>
   <script src="https://cdn.jsdelivr.net/npm/tesseract.js@5/dist/tesseract.min.js"></script>
  </head>
  <body>
    <input type="file" id="uploader" multiple />
    <script type="module">
      const worker = await Tesseract.createWorker("nor", 2, {
        legacyCore: true,
        legacyLang: true,
      });
      // setParameters is asynchronous, so wait for it before recognizing.
      await worker.setParameters({ tessedit_pageseg_mode: "3" });

      const recognize = async function (evt) {
        const files = evt.target.files;

        for (let i = 0; i < files.length; i++) {
          const ret = await worker.recognize(
            files[i],
            { rotateAuto: true },
            { osd: true }
          );

          const osdAngle =
            parseFloat(
              ret.data.osd.match(/Orientation in degrees: (\d+)/)?.[1]
            ) || 0;
          const autoRotateAngle = ret.data.rotateRadians * (180 / Math.PI) * -1;
          const totalAngle = osdAngle + autoRotateAngle;
          console.log("osdAngle: " + osdAngle + " (degrees)");
          console.log("autoRotateAngle: " + autoRotateAngle + " (degrees)");
          console.log("totalAngle: " + totalAngle + " (degrees)");

          console.log(ret.data.text);
        }
      };
      const elm = document.getElementById("uploader");
      elm.addEventListener("change", recognize);
    </script>
  </body>
</html>

Image used:
Screenshot 2024-08-01 at 08 34 40

Expected behavior
Uploading the image gives this:
osdAngle: 270 (degrees)
autoRotateAngle: 0 (degrees)
totalAngle: 270 (degrees)

That is the correct angle. But shouldn't Tesseract.js be able to rotate the image to 0 degrees first and then run OCR, giving a more accurate result?
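For anyone who wants to log the detected values in a more structured way, here is a minimal sketch that parses the OSD report and combines it with the auto-rotate angle. The function names `parseOsd` and `totalAngleDegrees` are hypothetical helpers, not part of Tesseract.js. Only the "Orientation in degrees" label appears in the code above; the "Rotate" and "Orientation confidence" labels are assumed from the usual Tesseract OSD report format and may differ between versions.

```javascript
// Parse the plain-text OSD report (ret.data.osd in the repro above) into numbers.
// Assumption: the report contains "Label: value" lines; labels other than
// "Orientation in degrees" are assumed, not confirmed in this issue.
function parseOsd(osdText) {
  const readNumber = (label) => {
    const match = new RegExp(label + ":\\s*(-?\\d+(?:\\.\\d+)?)").exec(osdText ?? "");
    return match ? parseFloat(match[1]) : null; // null when the field is missing
  };
  return {
    orientationDegrees: readNumber("Orientation in degrees"),
    rotateDegrees: readNumber("Rotate"),
    orientationConfidence: readNumber("Orientation confidence"),
  };
}

// Combine the coarse OSD angle (degrees) with the fine auto-rotate angle (radians),
// using the same sign convention as the repro code, normalized to [0, 360).
function totalAngleDegrees(osdDegrees, rotateRadians) {
  const fine = -(rotateRadians ?? 0) * (180 / Math.PI);
  const total = ((osdDegrees ?? 0) + fine) % 360;
  return total < 0 ? total + 360 : total;
}
```

In the repro this could be called as `totalAngleDegrees(parseOsd(ret.data.osd).orientationDegrees, ret.data.rotateRadians)`. Returning `null` for a missing field keeps "not detected" distinguishable from a detected angle of 0.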

Device Version:

  • macOS 14.5
  • Google Chrome
@Balearica added the "dependency bug" label (Valid bug where fixing is outside the scope of this repo) on Aug 1, 2024
@Balearica
Member

Thanks for making this new issue, and providing a sample document. I was able to replicate this using the provided code and image.

I am not sure why this is happening. However, it appears to be a bug inherited from the main Tesseract codebase rather than something introduced in the Tesseract.js repo: I tested with my local version of the Tesseract CLI and saw the same behavior.

Regarding a path forward, we should check for existing issues on the Tesseract GitHub page to see if this has already been reported. I would assume a bug this notable has already been reported at some point. The possible outcomes as they pertain to Tesseract.js are:

  1. Ideally the issue has already been reported or fixed, or a fix is in process.
    1. The version of Tesseract used by Tesseract.js lags slightly behind the main release, so there is a non-zero chance that we can fix this simply by updating to the latest version of Tesseract.
  2. If not, then a fix should be developed and contributed to Tesseract.
    1. The scope of Tesseract.js is a JavaScript/WebAssembly port of Tesseract--as a rule, we try to stay as close to that codebase as possible.
    2. I have contributed patches to Tesseract in the past, so may have time to look into this if needed.
  3. If the Tesseract maintainers are (for some unforeseen reason) resistant to patching, then we can consider implementing something within this codebase.

I am not sure what the root cause is, but I tested this image at various angles: it is recognized correctly at 0 and 90 degrees, but incorrectly at 180 and 270 degrees. So orientation handling works as intended in some cases, which makes this more perplexing.
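Until the root cause is found, one possible stopgap (a sketch only, not something tested against this image or agreed on in this thread) is to rotate the pixels by a multiple of 90 degrees before calling `recognize`, so that Tesseract only ever sees an orientation it handles. The helper name `rotateRgbaClockwise` is hypothetical, and whether the reported OSD angle should be applied clockwise or counter-clockwise is an assumption that would need to be checked against real output.

```javascript
// Rotate an RGBA raster clockwise by a multiple of 90 degrees.
// `image` is a plain { width, height, data } object shaped like a canvas ImageData
// (4 bytes per pixel, row-major). Hypothetical helper, not a Tesseract.js API.
function rotateRgbaClockwise(image, degrees) {
  // Number of clockwise quarter turns, normalized to 0..3 (handles negatives).
  const turns = (((degrees / 90) % 4) + 4) % 4;
  if (!Number.isInteger(turns)) {
    throw new RangeError("degrees must be a multiple of 90");
  }
  const { width, height, data } = image;
  const swap = turns % 2 === 1; // odd quarter turns swap width and height
  const outWidth = swap ? height : width;
  const outHeight = swap ? width : height;
  const out = new Uint8ClampedArray(data.length);

  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let nx = x;
      let ny = y;
      if (turns === 1) {
        nx = height - 1 - y;
        ny = x;
      } else if (turns === 2) {
        nx = width - 1 - x;
        ny = height - 1 - y;
      } else if (turns === 3) {
        nx = y;
        ny = width - 1 - x;
      }
      const src = (y * width + x) * 4;
      const dst = (ny * outWidth + nx) * 4;
      // Copy the four channels (R, G, B, A) of this pixel.
      for (let c = 0; c < 4; c++) {
        out[dst + c] = data[src + c];
      }
    }
  }
  return { width: outWidth, height: outHeight, data: out };
}
```

In a browser, the input could come from `getImageData` on a canvas 2D context, and the result could be written back to a canvas with `putImageData` before passing that canvas to the worker. Restricting the helper to quarter turns avoids any resampling, so the pixels reach the OCR step unchanged.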

@matsklevstad changed the title from "Detects correct rotating/degress, but fails to read the text" to "Detects correct rotation/degrees, but fails to read the text" on Aug 1, 2024