You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
At present, we have 2 interfaces for running recognition: schedulers and workers. Both interfaces are explained in this guide. I plan on combining them into a single interface, taking the syntax from the worker interface, and the "under the hood" functionality from the scheduler interface. This would not have any immediate impact on existing code, however some functions may be depreciated. I believe this will solve several issues that exist at present.
Workers are not a beginner-friendly option
Historically, schedulers have been seen as a secondary "advanced" interface for users who want to implement parallel processing. While it is true that the syntax for creating a new worker is currently simpler than making a scheduler, looking beyond the syntax, schedulers are a significantly more beginner-friendly method of managing workers than doing so directly. When creating/managing workers directly, new users are highly likely to manage workers in a way that causes one of the two issues described below.
Code that manually creates/manages many workers is frequently problematic
When users write code that creates and destroys workers within code that runs asynchronously, their code commonly allows for an undefined number of workers to exist at the same time. For example, say that a Node.js application starts a worker and runs recognition whenever an API endpoint receives an image. This would run fine on a low-traffic server, however if the endpoint was hit 20 times in a short period of time, 20 workers would be created, and the application would crash.
This issues is resolved by implementing using a scheduler, which manages a defined number of workers, which can be easily tweaked according to the capabilities of the device. While in theory a user could still crash their program by making multiple schedulers, saying "only ever use one scheduler" is simpler to communicate than explaining how to properly manage workers.
Code that manually re-uses a single worker may also be problematic
Alternatively, say that the user creates only a single worker, and re-uses it between jobs. While their code will not encounter the issue above, it can still encounter issues if their recognition code can be triggered asynchronously.
Workers are written with the assumption that only one job is run at a time. Unlike with schedulers, workers have no job queuing mechanism. When this assumption is violated, and worker.recognize is called before the previous worker.recognize job is completed, unexpected results can occur (see #875). This can be resolved by using a scheduler (even if there is only one worker), as schedulers will queue jobs.
Having multiple interfaces is confusing to new users
At present, the syntax in the README uses workers manually. If somebody copy/pastes this code into a function, they are likely to cause one of the issues described above. Transitioning to schedulers (the recommended approach) requires reading an entire separate documentation page and rewriting their code to use different syntax.
Planned Changes
The changes this would entail are described below. In short, we would combine the simple syntax of creating and using a worker, with the inner-workings of schedulers (a job queue and support for parallel jobs). I do not believe any of this is technically a breaking change, as the old functions would not go anywhere. However, users would be strongly encouraged to switch from workers to schedulers in documentation. Therefore, should probably still be a major release.
Add scheduler.recognize/scheduler.detect functions that run a job.
This should be the same syntax as worker.recognize/worker.detect, and minimize switching costs.
Expose the workers within a scheduler
This would allow for the same fine-grained control of workers that is currently possible.
At present, we have 2 interfaces for running recognition: schedulers and workers. Both interfaces are explained in this guide. I plan on combining them into a single interface, taking the syntax from the worker interface, and the "under the hood" functionality from the scheduler interface. This would not have any immediate impact on existing code, however some functions may be depreciated. I believe this will solve several issues that exist at present.
Workers are not a beginner-friendly option
Historically, schedulers have been seen as a secondary "advanced" interface for users who want to implement parallel processing. While it is true that the syntax for creating a new worker is currently simpler than making a scheduler, looking beyond the syntax, schedulers are a significantly more beginner-friendly method of managing workers than doing so directly. When creating/managing workers directly, new users are highly likely to manage workers in a way that causes one of the two issues described below.
Code that manually creates/manages many workers is frequently problematic
When users write code that creates and destroys workers within code that runs asynchronously, their code commonly allows for an undefined number of workers to exist at the same time. For example, say that a Node.js application starts a worker and runs recognition whenever an API endpoint receives an image. This would run fine on a low-traffic server, however if the endpoint was hit 20 times in a short period of time, 20 workers would be created, and the application would crash.
This issues is resolved by implementing using a scheduler, which manages a defined number of workers, which can be easily tweaked according to the capabilities of the device. While in theory a user could still crash their program by making multiple schedulers, saying "only ever use one scheduler" is simpler to communicate than explaining how to properly manage workers.
Code that manually re-uses a single worker may also be problematic
Alternatively, say that the user creates only a single worker, and re-uses it between jobs. While their code will not encounter the issue above, it can still encounter issues if their recognition code can be triggered asynchronously.
Workers are written with the assumption that only one job is run at a time. Unlike with schedulers, workers have no job queuing mechanism. When this assumption is violated, and
worker.recognize
is called before the previousworker.recognize
job is completed, unexpected results can occur (see #875). This can be resolved by using a scheduler (even if there is only one worker), as schedulers will queue jobs.Having multiple interfaces is confusing to new users
At present, the syntax in the README uses workers manually. If somebody copy/pastes this code into a function, they are likely to cause one of the issues described above. Transitioning to schedulers (the recommended approach) requires reading an entire separate documentation page and rewriting their code to use different syntax.
Planned Changes
The changes this would entail are described below. In short, we would combine the simple syntax of creating and using a worker, with the inner-workings of schedulers (a job queue and support for parallel jobs). I do not believe any of this is technically a breaking change, as the old functions would not go anywhere. However, users would be strongly encouraged to switch from workers to schedulers in documentation. Therefore, should probably still be a major release.
scheduler.recognize
/scheduler.detect
functions that run a job.worker.recognize
/worker.detect
, and minimize switching costs.1
worker, which would make it a drop-in replacement forcreateWorker
The text was updated successfully, but these errors were encountered: