mirror of https://github.com/papermerge/mglib
Compare commits
2 Commits
6f7e8ba0e2...7ddb02dcb5
Author | SHA1 | Date
---|---|---
Eugen Ciur | 7ddb02dcb5 |
Eugen Ciur | eb98ef1329 |
@@ -1,5 +1,43 @@
class Step:
    # Q: What is ``Step`` and why was it a bad decision to introduce it?
    #
    # A: The ``Step`` class is closely related to zooming in/out of
    # a specific page of the document in the frontend (JavaScript code).
    #
    # When a user opens the document in the document viewer, they actually
    # see an image with text over it (a text overlay). The text overlay is
    # created from hOCR data. A very important point here is that the
    # hOCR data corresponds to the (extracted, JPEG format) image of the page
    # with the VERY SAME width/height. Again, the hOCR file and the respective
    # image file of the page MUST HAVE THE SAME WIDTH AND HEIGHT.
    #
    # Each step is meant to be a specific zoom value of the page. Thus, step
    # 2 corresponds to LIST[2] % = 75 % of the page's initial logical size
    # of WIDTH_100p = 1240.
    # When the user zooms in/out, a new hOCR file is downloaded
    # corresponding to that zoom step. As you may guess, the user can zoom
    # only to 125%, 100%, 75% and 50%. The value of 10% corresponds to the
    # thumbnail of the document and does not count as a 'real' step.
    #
    # Instead of doing this step thingy, it would have been better to drop
    # the entire step concept. A much better solution for zooming in/out
    # would have been to download one SVG file for each page (instead of
    # hOCR). The SVG file of the respective page should contain an embedded
    # image (binary JPEG; yes, the SVG format allows embedding of binary
    # formats!) and a correctly mapped text overlay (built from the hOCR
    # file). The user can then zoom in/out using SVG transformations in the
    # frontend!
    #
    # The good things about the SVG solution are:
    #
    # * 4x less OCR will be required (currently there is one
    #   hOCR per step, minus the thumbnail/10% step)
    # * it will simplify the front-end code, as the SVG (= hOCR + JPEG)
    #   will be generated on the server side
    # * it eliminates the concept of Step entirely
    #   (there will be only one SVG file per page)
    # * it increases front-end and back-end performance, as only one SVG
    #   file will be sent from backend to frontend
    #
    # width of a document when displayed as 100%.
    WIDTH_100p = 1240
    PERCENT = 100
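
The zoom arithmetic described in the comment can be sketched as follows. This is only an illustration, not mglib's actual code: the ordering of `LIST` is inferred from the comment (`LIST[2]` is 75, and 10 is the thumbnail), and `step_width` is a hypothetical helper introduced here.

```python
# Illustrative sketch; LIST ordering and step_width are assumptions
# inferred from the comment above, not taken from the repository.
WIDTH_100p = 1240  # width of a document when displayed as 100%
PERCENT = 100
LIST = [125, 100, 75, 50, 10]  # zoom percentages; 10 is the thumbnail


def step_width(index: int) -> int:
    """Return the page width for the zoom step LIST[index]."""
    return WIDTH_100p * LIST[index] // PERCENT


# One width per step; each one needs its own image and hOCR file.
widths = [step_width(i) for i in range(len(LIST))]
print(widths)
```

Under these assumptions, step 2 gives a width of 930, and every non-thumbnail width needs its own OCR pass, which is the cost the comment objects to.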
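
The SVG alternative proposed in the comment might look roughly like the sketch below. It is a hedged illustration under stated assumptions, not code from mglib: `build_page_svg` and the `(text, x0, y0, x1, y1)` word tuples are hypothetical, the word boxes are assumed to have already been parsed from the hOCR file, and the JPEG is embedded as a base64 data URI.

```python
import base64
from xml.sax.saxutils import escape


def build_page_svg(jpeg_bytes, width, height, words):
    """Build one SVG per page: embedded JPEG plus an invisible text overlay.

    words: iterable of (text, x0, y0, x1, y1) boxes in image pixels
    (hypothetical format; assumed to come from the page's hOCR data).
    """
    data = base64.b64encode(jpeg_bytes).decode("ascii")
    parts = [
        # viewBox keeps image and overlay in one coordinate system,
        # so scaling the SVG scales both together.
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'viewBox="0 0 {width} {height}">',
        f'<image href="data:image/jpeg;base64,{data}" '
        f'width="{width}" height="{height}"/>',
    ]
    for text, x0, y0, x1, y1 in words:
        # Stretch each word to its box width; keep it selectable but unseen.
        parts.append(
            f'<text x="{x0}" y="{y1}" font-size="{y1 - y0}" '
            f'textLength="{x1 - x0}" lengthAdjust="spacingAndGlyphs" '
            f'fill-opacity="0">{escape(text)}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Usage with placeholder bytes (not a real image) and one made-up word box.
svg = build_page_svg(b"\xff\xd8\xff\xd9", 1240, 1754,
                     [("Hello & bye", 100, 200, 300, 230)])
```

Because the image and the text share one `viewBox`, the frontend could zoom by resizing or transforming the single SVG element, without requesting a new file per zoom level.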