Add SIMD detection when corePath is specified #735

Closed
Balearica opened this issue Apr 14, 2023 · 1 comment · Fixed by #745

Comments

@Balearica
Member

The SIMD-enabled build of tesseract-core.js (tesseract-core-simd.wasm.js) recognizes text significantly faster with the Tesseract LSTM model (the default) than the build without SIMD support (tesseract-core.wasm.js), but it is not compatible with all browsers and devices. When the user does not specify corePath, we automatically detect which build is appropriate. When corePath is specified, however, we load whatever file the user provides and trust them to pick the correct one. I do not believe many users understand this distinction, and most are probably serving a single build to all of their users. We should change this so that Tesseract.js can detect and load the correct build even if the user wants to self-host Tesseract.js-core or use a different CDN.

let corePathImport = corePath;
// Only auto-detect when the user did not supply corePath.
if (!corePathImport) {
  const simdSupport = await simd();
  // substring(1) drops the first character of the version string (e.g. a leading '^').
  if (simdSupport) {
    corePathImport = `https://unpkg.com/tesseract.js-core@v${dependencies['tesseract.js-core'].substring(1)}/tesseract-core-simd.wasm.js`;
  } else {
    corePathImport = `https://unpkg.com/tesseract.js-core@v${dependencies['tesseract.js-core'].substring(1)}/tesseract-core.wasm.js`;
  }
}
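
For context, the check behind a call like simd() can be approximated by validating a tiny WebAssembly module that uses a SIMD instruction. The sketch below is an illustration, not the code Tesseract.js ships: the name detectSimd is hypothetical, and the byte sequence is assumed to match the probe module used by feature-detection libraries such as wasm-feature-detect.

```javascript
// Hypothetical helper, for illustration only.
// The bytes encode a minimal wasm module with one function that returns a
// v128 value built with SIMD instructions (i8x16.splat, i8x16.popcnt).
// Engines without SIMD support reject the module during validation.
const SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0,            // "\0asm" magic + version 1
  1, 5, 1, 96, 0, 1, 123,                 // type section: () -> v128
  3, 2, 1, 0,                             // function section: one function of type 0
  10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11, // code section with the SIMD body
]);

function detectSimd() {
  // WebAssembly.validate returns a boolean and does not throw on invalid bytes.
  return typeof WebAssembly === 'object' && WebAssembly.validate(SIMD_PROBE);
}

console.log(`SIMD supported: ${detectSimd()}`);
```

Unlike the simd() call in the snippet above, this version is synchronous, so no await is needed.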

@Balearica
Member Author

Unfortunately, I don't think we can change the current behavior for users who already specify a custom corePath argument. If a user specifies corePath to point to a self-hosted copy of tesseract-core.wasm.js, we cannot assume that tesseract-core-simd.wasm.js also exists in the same directory (or vice versa). Therefore, I decided on the following:

  1. If you set corePath to a specific .js file, that file will be loaded.
  2. If you set corePath to a directory, either tesseract-core.wasm.js or tesseract-core-simd.wasm.js will be loaded from it, depending on whether SIMD is supported.
    1. Both files are assumed to exist in the directory specified.

This means users who are already setting corePath to a specific .js file will need to edit their code to see the benefit of this change. This should simply be a matter of (1) deleting tesseract-core.wasm.js from the file path and (2) confirming tesseract-core.wasm.js and tesseract-core-simd.wasm.js exist in the directory.
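
The two rules above can be sketched roughly as follows. This is a minimal illustration, not the implementation merged in #745: resolveCorePath is a hypothetical name, the example.com URLs are placeholders, and treating any path that ends in .js as a specific file is an assumption about how the two cases are told apart.

```javascript
// Hypothetical helper, for illustration only.
// corePath: the value passed by the user (a .js file or a directory).
// simdSupport: result of a SIMD feature check, passed in to keep this pure.
function resolveCorePath(corePath, simdSupport) {
  // Rule 1: a path that names a specific .js file is used unchanged.
  if (corePath.endsWith('.js')) {
    return corePath;
  }
  // Rule 2: otherwise treat the path as a directory and choose the build.
  // Both files are assumed to exist in that directory.
  const fileName = simdSupport
    ? 'tesseract-core-simd.wasm.js'
    : 'tesseract-core.wasm.js';
  const dir = corePath.replace(/\/$/, ''); // drop one trailing slash, if present
  return `${dir}/${fileName}`;
}

// Before: a specific file is always loaded, regardless of SIMD support.
console.log(resolveCorePath('https://example.com/core/tesseract-core.wasm.js', true));
// After: a directory lets the SIMD build be chosen when it is supported.
console.log(resolveCorePath('https://example.com/core', true));
console.log(resolveCorePath('https://example.com/core/', false));
```

Under this sketch, existing configurations that point at a specific file keep their current behavior, which is the backward-compatibility constraint described above.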
