Add ocr tools

main^2
Tait Hoyem 2 years ago
parent ae4217f35e
commit b2ffcb34bc

@ -0,0 +1,73 @@
# OCR Diagram Tools
This suite of tools provides a simpler way to produce braille diagrams.
It works on the *original* image and does not create a new diagram from scratch.
I believe this is the best approach for most cases.
You can run each tool individually with `cargo run --bin <binary_name_here> [arg1] [arg2]`.
Go to `braille-diagram-tools` to use the binaries.
## btt-get-ocr
`btt-get-ocr [diagram.png] [out.json]`
This program takes the image at `diagram.png` (the default) and writes OCR data in JSON format to `out.json` (also the default).
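For orientation, each entry in the output is a flat record. The sketch below mirrors the keys that appear in the sample JSON files in this commit; the field types are an assumption inferred from that data and from how `btt-edit-tools` uses the values, and the hand-written `to_json` and `escape` helpers are purely illustrative (the tools themselves go through `serde_json`).

```rust
/// One OCR region, mirroring the keys seen in the sample JSON files.
/// Field types are an assumption inferred from the data, not copied
/// from the `ocr-json-common` crate.
#[derive(Debug, Clone, PartialEq)]
pub struct TextBox {
    pub id: String,
    pub hint: String,
    pub confidence: i32,
    pub width: u32,
    pub height: u32,
    pub x: i32,
    pub y: i32,
}

/// Escape the few characters that matter for the OCR hints shown here.
fn escape(s: &str) -> String {
    let mut out = String::new();
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out
}

/// Hypothetical helper: render one record in the same key order as the samples.
pub fn to_json(b: &TextBox) -> String {
    format!(
        "{{\"id\":\"{}\",\"hint\":\"{}\",\"confidence\":{},\"width\":{},\"height\":{},\"x\":{},\"y\":{}}}",
        escape(&b.id),
        escape(&b.hint),
        b.confidence,
        b.width,
        b.height,
        b.x,
        b.y
    )
}
```

Note that Tesseract hints often end in a newline (for example `"A\n"` in the samples), which is why the escape step handles `\n`.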
## btt-label-ocr
`btt-label-ocr [diagram.png] [out.json]`
This program takes the OCR data and lays it over the diagram.
How so?
It adds a rectangular box around each OCRed piece of text, along with a label (usually a number starting from 0) to the left of the box.
This allows you to see how accurate the OCR was and to fix what is broken with this next tool:
## btt-edit-tools
`btt-edit-tools [out.json]`
This program edits the JSON (NOT in place) with manipulation commands that help with OCR-related tasks.
The id drawn to the left of each box is how you refer to a section in these commands.
Here is a list of all the commands you can use while running `btt-edit-tools`:
```
merge|id1|id2 # merges two OCRed sections together, least of x and y to greatest of x+width and y+height.
vsplit|id # split OCRed section into top and bottom pieces.
hsplit|id # split OCRed section into left and right pieces.
add|x|y|w|h # add a new OCR section with x, y, width and height.
rem|id # remove an OCR section
triml|id|dw # trim the left edge: move x right by dw and reduce width by dw
trimr|id|dw # trim the right edge: reduce width by dw
trimt|id|dh # trim the top edge: move y down by dh and reduce height by dh
trimb|id|dh # trim the bottom edge: reduce height by dh
movel|id|dx # change x by dx (positive = left move)
mover|id|dx # change x by dx (positive = right move)
moveu|id|dy # change y by dy (positive = upward move)
moved|id|dy # change y by dy (positive = downward move)
paddl|id|dx # pad the left edge: move x left by dx and add dx to width
paddr|id|dx # pad the right edge: add dx to width
paddt|id|dy # pad the top edge: move y up by dy and add dy to height
paddb|id|dy # pad the bottom edge: add dy to height
text|id|some string # set the text associated with id to some string (used for braille printing later)
save|filename # save current json data to filename
```
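To make the box arithmetic behind a few of these commands concrete, here is a minimal sketch on a standalone `Rect` type. The type and function names are hypothetical and exist only for illustration; the coordinate convention (x, y is the top-left corner, y grows downward) is assumed from how the sample data and the editor source treat boxes.

```rust
/// Axis-aligned box; (x, y) is the top-left corner and y grows downward.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rect {
    pub x: i32,
    pub y: i32,
    pub w: i32,
    pub h: i32,
}

/// merge: the smallest box that covers both inputs, from the least x and y
/// to the greatest x+width and y+height.
pub fn merge(a: Rect, b: Rect) -> Rect {
    let x = a.x.min(b.x);
    let y = a.y.min(b.y);
    let right = (a.x + a.w).max(b.x + b.w);
    let bottom = (a.y + a.h).max(b.y + b.h);
    Rect { x, y, w: right - x, h: bottom - y }
}

/// paddl: grow the box leftward by `dx`, keeping the right edge fixed.
pub fn pad_left(r: Rect, dx: i32) -> Rect {
    Rect { x: r.x - dx, w: r.w + dx, ..r }
}

/// mover: shift the whole box right by `dx` without resizing it.
pub fn move_right(r: Rect, dx: i32) -> Rect {
    Rect { x: r.x + dx, ..r }
}
```

For example, merging the `A3` and `B3` boxes from one of the sample files (x 255 and x 321, both 66 wide, at y 8) gives a single box at x 255 that is 132 wide.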
**NOTE: currently, any command error will crash the program;
this will be fixed eventually.**
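One possible way to avoid those crashes is to parse each line into a `Result` and report the error instead of unwrapping. The sketch below covers only a subset of the commands; the `Command` enum and `parse_command` function are hypothetical names, not part of the current tool.

```rust
/// A subset of the editor commands, parsed from `name|arg|arg` lines.
#[derive(Debug, PartialEq)]
pub enum Command {
    Merge { a: String, b: String },
    Remove { id: String },
    MoveRight { id: String, dx: i32 },
    Save { filename: String },
}

/// Parse one input line, reporting problems as `Err` instead of panicking.
pub fn parse_command(line: &str) -> Result<Command, String> {
    let mut parts = line.trim().split('|');
    let name = parts.next().unwrap_or("");
    // Fetch the next field, or explain which one is missing.
    let mut arg = |what: &str| {
        parts
            .next()
            .filter(|s| !s.is_empty())
            .map(str::to_string)
            .ok_or_else(|| format!("{name}: missing {what}"))
    };
    match name {
        "merge" => Ok(Command::Merge { a: arg("id1")?, b: arg("id2")? }),
        "rem" => Ok(Command::Remove { id: arg("id")? }),
        "mover" => {
            let id = arg("id")?;
            let dx = arg("dx")?
                .parse::<i32>()
                .map_err(|e| format!("mover: bad dx: {e}"))?;
            Ok(Command::MoveRight { id, dx })
        }
        "save" => Ok(Command::Save { filename: arg("filename")? }),
        other => Err(format!("unknown command: {other:?}")),
    }
}
```

With this shape, the input loop could print the `Err` message and prompt again, so a typo such as `mover|3|abc` would no longer end the session.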
## btt-whiteout-labels
`btt-whiteout-labels [out.img] [out.json]`
This program takes an image and a JSON file and places a filled white box over every OCR section.
This is useful before running the next tool.
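As a rough picture of the whiteout step, the sketch below fills one rectangle with white in a plain 8-bit grayscale buffer. The buffer layout, the `whiteout` function, and the clipping behaviour are all illustrative assumptions; the actual tool works on the diagram image through the `image` and `imageproc` crates.

```rust
/// Fill a rectangle with white in an 8-bit grayscale buffer (row-major).
/// Any part of the rectangle that falls outside the image is ignored.
pub fn whiteout(
    pixels: &mut [u8],
    img_w: usize,
    img_h: usize,
    x: usize,
    y: usize,
    w: usize,
    h: usize,
) {
    assert_eq!(pixels.len(), img_w * img_h, "buffer size must match dimensions");
    // Clip the rectangle to the image bounds.
    let x_end = (x + w).min(img_w);
    let y_end = (y + h).min(img_h);
    for row in y..y_end {
        for col in x..x_end {
            pixels[row * img_w + col] = 255;
        }
    }
}
```

Calling this once per OCR section erases the printed text, which leaves room for the braille added in the next step.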
## btt-add-braille
`btt-add-braille [out.img] [out.json] [font_size]`
This program takes an image and a JSON file and, for each OCR section in the JSON, draws that section's text as braille at its (x, y) position using `font_size` (default `20.0`).
This often does not work the first time because braille uses more characters than standard print.
You'll generally need to go back to editing the JSON, then rerun `whiteout` and this tool a few times to get it just right.
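The source for this tool asks liblouis for Unicode dot patterns (`DOTS_UNICODE`) and draws them with a braille font. As background, the sketch below shows how a Unicode braille-pattern character is built from dot numbers; `braille_from_dots` is a hypothetical helper for illustration and is not how liblouis performs translation.

```rust
/// Build a Unicode braille-pattern character from raised dot numbers (1..=8).
/// Dot n sets bit n-1 above the base code point U+2800.
/// Returns `None` if any dot number is out of range.
pub fn braille_from_dots(dots: &[u8]) -> Option<char> {
    let mut bits: u32 = 0;
    for &d in dots {
        if !(1..=8).contains(&d) {
            return None;
        }
        bits |= 1u32 << (d - 1);
    }
    char::from_u32(0x2800 + bits)
}
```

Each translated character occupies one full braille cell regardless of the print glyph it came from, which is part of why labels tend to need wider boxes than the original text.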

File diff suppressed because it is too large.

@ -0,0 +1,16 @@
[package]
name = "braille-diagram-tools"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
leptess = "^0.13.1"
ocr-json-common = { path = "../ocr-json-common" }
serde_json = "^1.0"
imageproc = "^0.22.0"
image = "^0.23.14"
louis = { git = "https://github.com/TTWNO/liblouis-rust" }
rusttype = "^0.9.2"
text_io = "^0.1.9"

Binary file not shown (size: 57 KiB).

@ -0,0 +1 @@
[{"id":"0","hint":"A","confidence":0,"width":89,"height":94,"x":212,"y":105},{"id":"1","hint":"AND","confidence":0,"width":333,"height":112,"x":512,"y":500},{"id":"3","hint":"AB","confidence":0,"width":180,"height":130,"x":1080,"y":170},{"id":"2","hint":"B","confidence":0,"width":140,"height":95,"x":162,"y":257}]

@ -0,0 +1 @@
[{"id":"0","hint":"A\n","confidence":96,"width":89,"height":94,"x":212,"y":105},{"id":"1","hint":"AND","confidence":0,"width":333,"height":112,"x":512,"y":500},{"id":"3","hint":"AB","confidence":0,"width":200,"height":100,"x":1075,"y":187},{"id":"2","hint":"B","confidence":0,"width":89,"height":98,"x":212,"y":254}]

Binary file not shown (size: 32 KiB).

Binary file not shown (size: 16 KiB).

@ -0,0 +1 @@
[{"id":"0","hint":"A","confidence":0,"width":89,"height":94,"x":212,"y":105},{"id":"1","hint":"AND","confidence":0,"width":333,"height":112,"x":512,"y":500},{"id":"3","hint":"AB","confidence":0,"width":180,"height":130,"x":1080,"y":170},{"id":"2","hint":"B","confidence":0,"width":140,"height":95,"x":162,"y":257}]

@ -0,0 +1 @@
[{"id":"0","hint":"1\n","confidence":90,"width":14,"height":23,"x":322,"y":4},{"id":"1","hint":"1\n","confidence":80,"width":14,"height":23,"x":571,"y":4},{"id":"2","hint":"1\n","confidence":90,"width":14,"height":23,"x":821,"y":4},{"id":"3","hint":"1\n","confidence":90,"width":14,"height":23,"x":1070,"y":4},{"id":"4","hint":"Clock","confidence":0,"width":130,"height":55,"x":6,"y":28},{"id":"5","hint":"0","confidence":0,"width":48,"height":28,"x":418,"y":38},{"id":"6","hint":"0","confidence":0,"width":62,"height":28,"x":663,"y":38},{"id":"7","hint":"0","confidence":0,"width":78,"height":28,"x":912,"y":38},{"id":"8","hint":"Load","confidence":0,"width":134,"height":54,"x":5,"y":91},{"id":"9","hint":"D","confidence":0,"width":130,"height":53,"x":4,"y":167},{"id":"10","hint":"Q","confidence":0,"width":133,"height":35,"x":6,"y":273},{"id":"11","hint":"0","confidence":0,"width":63,"height":28,"x":168,"y":38}]

Binary file not shown (size: 26 KiB).

Binary file not shown (size: 15 KiB).

Binary file not shown (size: 8.0 KiB).

@ -0,0 +1 @@
[{"id":"0","hint":"1\n","confidence":90,"width":14,"height":23,"x":322,"y":4},{"id":"1","hint":"1\n","confidence":80,"width":14,"height":23,"x":571,"y":4},{"id":"2","hint":"1\n","confidence":90,"width":14,"height":23,"x":821,"y":4},{"id":"3","hint":"1\n","confidence":90,"width":14,"height":23,"x":1070,"y":4},{"id":"11","hint":"","confidence":0,"width":63,"height":28,"x":168,"y":38},{"id":"4","hint":"Clock","confidence":0,"width":130,"height":55,"x":6,"y":28},{"id":"5","hint":"0","confidence":0,"width":48,"height":28,"x":418,"y":38},{"id":"6","hint":"0","confidence":0,"width":62,"height":28,"x":663,"y":38},{"id":"7","hint":"0","confidence":0,"width":78,"height":28,"x":912,"y":38},{"id":"8","hint":"Load","confidence":0,"width":134,"height":54,"x":5,"y":91},{"id":"9","hint":"D","confidence":0,"width":130,"height":53,"x":4,"y":167},{"id":"10","hint":"Q","confidence":0,"width":133,"height":35,"x":6,"y":273}]

@ -0,0 +1 @@
[{"id":"0","hint":"branch","confidence":0,"width":78,"height":19,"x":1015,"y":15},{"id":"1","hint":"ALU operation","confidence":0,"width":162,"height":25,"x":792,"y":431},{"id":"2","hint":"add","confidence":0,"width":47,"height":23,"x":567,"y":317},{"id":"3","hint":"pc","confidence":0,"width":39,"height":42,"x":97,"y":557},{"id":"6","hint":"add","confidence":0,"width":47,"height":21,"x":362,"y":329},{"id":"12","hint":"MemRead","confidence":0,"width":125,"height":33,"x":1143,"y":735},{"id":"21","hint":"m","confidence":0,"width":25,"height":27,"x":385,"y":90},{"id":"22","hint":"m","confidence":0,"width":33,"height":28,"x":840,"y":589},{"id":"24","hint":"u","confidence":0,"width":25,"height":25,"x":385,"y":125},{"id":"26","hint":"x","confidence":0,"width":25,"height":25,"x":385,"y":155},{"id":"28","hint":"x","confidence":0,"width":30,"height":30,"x":840,"y":655},{"id":"27","hint":"u","confidence":0,"width":30,"height":30,"x":840,"y":620},{"id":"29","hint":"m","confidence":0,"width":30,"height":30,"x":844,"y":310},{"id":"30","hint":"u","confidence":0,"width":30,"height":30,"x":845,"y":340},{"id":"31","hint":"x","confidence":0,"width":30,"height":30,"x":845,"y":375},{"id":"33","hint":"0","confidence":0,"width":57,"height":25,"x":1000,"y":598},{"id":"4","hint":"Data","confidence":0,"width":56,"height":24,"x":522,"y":463},{"id":"11","hint":"Registers","confidence":0,"width":118,"height":28,"x":624,"y":564},{"id":"14","hint":"Register #","confidence":0,"width":120,"height":29,"x":519,"y":599},{"id":"5","hint":"Register #","confidence":0,"width":119,"height":28,"x":522,"y":532},{"id":"19","hint":"Register #","confidence":0,"width":119,"height":28,"x":522,"y":663},{"id":"25","hint":"control","confidence":0,"width":134,"height":45,"x":527,"y":877},{"id":"13","hint":"Address","confidence":0,"width":98,"height":24,"x":1068,"y":560},{"id":"17","hint":"Data 
memory","confidence":0,"width":151,"height":76,"x":1094,"y":612},{"id":"23","hint":"Data","confidence":0,"width":56,"height":24,"x":1072,"y":718},{"id":"9","hint":"Instruction","confidence":0,"width":120,"height":23,"x":312,"y":565},{"id":"8","hint":"Address","confidence":0,"width":98,"height":23,"x":192,"y":565},{"id":"16","hint":"Insturction","confidence":0,"width":135,"height":23,"x":199,"y":628},{"id":"18","hint":"memory","confidence":0,"width":104,"height":24,"x":215,"y":665},{"id":"7","hint":"MemWrite","confidence":0,"width":139,"height":24,"x":1128,"y":516},{"id":"32","hint":"ALU","confidence":0,"width":52,"height":25,"x":913,"y":520},{"id":"20","hint":"RegWrite","confidence":0,"width":118,"height":28,"x":642,"y":681},{"id":"15","hint":"4","confidence":0,"width":25,"height":25,"x":278,"y":237}]

Binary file not shown (size: 171 KiB).

Binary file not shown (size: 107 KiB).

Binary file not shown (size: 46 KiB).

@ -0,0 +1 @@
[{"id":"15","hint":"","confidence":0,"width":25,"height":25,"x":278,"y":237},{"id":"0","hint":"branch","confidence":0,"width":78,"height":19,"x":1015,"y":15},{"id":"1","hint":"ALU operation","confidence":0,"width":162,"height":25,"x":792,"y":431},{"id":"2","hint":"add","confidence":0,"width":47,"height":23,"x":567,"y":317},{"id":"3","hint":"pc","confidence":0,"width":39,"height":42,"x":97,"y":557},{"id":"6","hint":"add","confidence":0,"width":47,"height":21,"x":362,"y":329},{"id":"12","hint":"MemRead","confidence":0,"width":125,"height":33,"x":1143,"y":735},{"id":"21","hint":"m","confidence":0,"width":25,"height":27,"x":385,"y":90},{"id":"22","hint":"m","confidence":0,"width":33,"height":28,"x":840,"y":589},{"id":"24","hint":"u","confidence":0,"width":25,"height":25,"x":385,"y":125},{"id":"26","hint":"x","confidence":0,"width":25,"height":25,"x":385,"y":155},{"id":"28","hint":"x","confidence":0,"width":30,"height":30,"x":840,"y":655},{"id":"27","hint":"u","confidence":0,"width":30,"height":30,"x":840,"y":620},{"id":"29","hint":"m","confidence":0,"width":30,"height":30,"x":844,"y":310},{"id":"30","hint":"u","confidence":0,"width":30,"height":30,"x":845,"y":340},{"id":"31","hint":"x","confidence":0,"width":30,"height":30,"x":845,"y":375},{"id":"33","hint":"0","confidence":0,"width":57,"height":25,"x":1000,"y":598},{"id":"4","hint":"Data","confidence":0,"width":56,"height":24,"x":522,"y":463},{"id":"11","hint":"Registers","confidence":0,"width":118,"height":28,"x":624,"y":564},{"id":"14","hint":"Register #","confidence":0,"width":120,"height":29,"x":519,"y":599},{"id":"5","hint":"Register #","confidence":0,"width":119,"height":28,"x":522,"y":532},{"id":"19","hint":"Register 
#","confidence":0,"width":119,"height":28,"x":522,"y":663},{"id":"20","hint":"RegWrite","confidence":0,"width":108,"height":28,"x":662,"y":671},{"id":"25","hint":"control","confidence":0,"width":134,"height":45,"x":527,"y":877},{"id":"13","hint":"Address","confidence":0,"width":98,"height":24,"x":1068,"y":560},{"id":"17","hint":"Data memory","confidence":0,"width":151,"height":76,"x":1094,"y":612},{"id":"23","hint":"Data","confidence":0,"width":56,"height":24,"x":1072,"y":718},{"id":"32","hint":"ALU","confidence":0,"width":52,"height":25,"x":933,"y":560},{"id":"9","hint":"Instruction","confidence":0,"width":120,"height":23,"x":312,"y":565},{"id":"8","hint":"Address","confidence":0,"width":98,"height":23,"x":192,"y":565},{"id":"16","hint":"Insturction","confidence":0,"width":135,"height":23,"x":199,"y":628},{"id":"18","hint":"memory","confidence":0,"width":104,"height":24,"x":215,"y":665},{"id":"7","hint":"MemWrite","confidence":0,"width":139,"height":24,"x":1128,"y":516}]

Binary file not shown (size: 53 KiB).

@ -0,0 +1 @@
[{"id":"9","hint":"S3","confidence":0,"width":61,"height":60,"x":300,"y":517},{"id":"10","hint":"S2","confidence":0,"width":60,"height":59,"x":543,"y":517},{"id":"11","hint":"S1","confidence":0,"width":60,"height":59,"x":786,"y":517},{"id":"12","hint":"S0","confidence":0,"width":60,"height":60,"x":1029,"y":517},{"id":"6","hint":"FA","confidence":0,"width":133,"height":61,"x":244,"y":260},{"id":"4","hint":"FA","confidence":0,"width":112,"height":68,"x":491,"y":260},{"id":"7","hint":"FA","confidence":0,"width":124,"height":81,"x":737,"y":260},{"id":"13","hint":"FA","confidence":0,"width":89,"height":76,"x":991,"y":260},{"id":"14","hint":"B3","confidence":0,"width":66,"height":59,"x":321,"y":8},{"id":"0","hint":"A3","confidence":0,"width":66,"height":59,"x":255,"y":8},{"id":"1","hint":"A2","confidence":0,"width":66,"height":58,"x":498,"y":8},{"id":"15","hint":"B2","confidence":0,"width":66,"height":58,"x":564,"y":8},{"id":"2","hint":"A1","confidence":0,"width":66,"height":58,"x":741,"y":8},{"id":"16","hint":"B1","confidence":0,"width":66,"height":58,"x":807,"y":8},{"id":"8","hint":"C0","confidence":0,"width":77,"height":61,"x":1267,"y":270},{"id":"5","hint":"C4","confidence":0,"width":87,"height":60,"x":3,"y":268},{"id":"18","hint":"C3","confidence":0,"width":80,"height":42,"x":390,"y":225},{"id":"19","hint":"C3","confidence":0,"width":75,"height":45,"x":635,"y":225},{"id":"20","hint":"C1","confidence":0,"width":72,"height":45,"x":877,"y":225},{"id":"17","hint":"B0","confidence":0,"width":66,"height":59,"x":1050,"y":8},{"id":"3","hint":"A0","confidence":0,"width":66,"height":59,"x":984,"y":8}]

Binary file not shown (size: 32 KiB).

Binary file not shown (size: 13 KiB).

@ -0,0 +1 @@
[{"id":"5","hint":"C4","confidence":0,"width":62,"height":60,"x":28,"y":268},{"id":"9","hint":"S3","confidence":0,"width":61,"height":60,"x":300,"y":517},{"id":"10","hint":"S2","confidence":0,"width":60,"height":59,"x":543,"y":517},{"id":"11","hint":"S1","confidence":0,"width":60,"height":59,"x":786,"y":517},{"id":"12","hint":"S0","confidence":0,"width":60,"height":60,"x":1029,"y":517},{"id":"6","hint":"FA","confidence":0,"width":133,"height":61,"x":244,"y":260},{"id":"14","hint":"","confidence":0,"width":66,"height":59,"x":321,"y":8},{"id":"1","hint":"AsB5","confidence":0,"width":66,"height":58,"x":498,"y":8},{"id":"15","hint":"","confidence":0,"width":66,"height":58,"x":564,"y":8},{"id":"2","hint":"A,B4","confidence":0,"width":66,"height":58,"x":741,"y":8},{"id":"16","hint":"","confidence":0,"width":66,"height":58,"x":807,"y":8},{"id":"3","hint":"AnBg","confidence":0,"width":66,"height":59,"x":984,"y":8},{"id":"17","hint":"","confidence":0,"width":66,"height":59,"x":1050,"y":8},{"id":"0","hint":"ALB4","confidence":0,"width":66,"height":59,"x":255,"y":8},{"id":"18","hint":"","confidence":0,"width":70,"height":67,"x":405,"y":200},{"id":"19","hint":"C3","confidence":0,"width":75,"height":70,"x":645,"y":200},{"id":"20","hint":"C1","confidence":0,"width":72,"height":70,"x":885,"y":200},{"id":"8","hint":"C0","confidence":0,"width":62,"height":61,"x":1282,"y":270},{"id":"4","hint":"FA","confidence":0,"width":112,"height":68,"x":491,"y":260},{"id":"7","hint":"FA","confidence":0,"width":124,"height":81,"x":737,"y":260},{"id":"13","hint":"FA","confidence":0,"width":89,"height":76,"x":991,"y":260}]

@ -0,0 +1 @@
[{"id":"0","hint":"Half Adder","confidence":0,"width":154,"height":24,"x":189,"y":38},{"id":"2","hint":"A","confidence":0,"width":21,"height":26,"x":28,"y":107},{"id":"1","hint":"B","confidence":0,"width":21,"height":26,"x":28,"y":133},{"id":"3","hint":"S (sum)","confidence":0,"width":122,"height":25,"x":437,"y":124},{"id":"5","hint":"C (carry out)","confidence":0,"width":230,"height":28,"x":436,"y":232}]

Binary file not shown (size: 15 KiB).

Binary file not shown (size: 7.5 KiB).

Binary file not shown (size: 68 KiB).

@ -0,0 +1 @@
[{"id":"4","hint":"OR","confidence":0,"width":212,"height":118,"x":439,"y":477},{"id":"2","hint":"A+B","confidence":0,"width":319,"height":109,"x":996,"y":171},{"id":"0","hint":"A","confidence":0,"width":189,"height":93,"x":102,"y":83},{"id":"1","hint":"B","confidence":0,"width":194,"height":130,"x":100,"y":227}]

Binary file not shown (size: 48 KiB).

Binary file not shown (size: 23 KiB).

@ -0,0 +1 @@
[{"id":"4","hint":"OR","confidence":0,"width":212,"height":118,"x":439,"y":477},{"id":"2","hint":"A+B","confidence":0,"width":319,"height":109,"x":996,"y":171},{"id":"0","hint":"A","confidence":0,"width":189,"height":93,"x":102,"y":83},{"id":"1","hint":"B","confidence":0,"width":194,"height":130,"x":100,"y":227}]

Binary file not shown (size: 87 KiB).

@ -0,0 +1 @@
[{"id":"0","hint":"A\n","confidence":96,"width":89,"height":93,"x":171,"y":46},{"id":"2","hint":"XOR\n","confidence":77,"width":310,"height":118,"x":440,"y":435},{"id":"1","hint":"B","confidence":0,"width":162,"height":116,"x":89,"y":185},{"id":"3","hint":"A$=[\"6B","confidence":0,"width":350,"height":125,"x":1025,"y":125}]

Binary file not shown (size: 54 KiB).

Binary file not shown (size: 26 KiB).

@ -0,0 +1 @@
[{"id":"0","hint":"A\n","confidence":96,"width":89,"height":93,"x":171,"y":46},{"id":"2","hint":"XOR\n","confidence":77,"width":310,"height":118,"x":440,"y":435},{"id":"1","hint":"B::>D—>A@B\n","confidence":0,"width":162,"height":116,"x":89,"y":185},{"id":"3","hint":"","confidence":0,"width":350,"height":125,"x":1025,"y":125}]

@ -0,0 +1,22 @@
#!/bin/bash
# Thin wrapper around the btt-* binaries.
prefix="btt-"
cmd=$1
# The JSON path defaults to ./out.json when no second argument is given.
json="${2:-./out.json}"
if [[ "$cmd" == "edit-json" ]]; then
    cargo run --bin "${prefix}edit-tools" "$json"
elif [[ "$cmd" == "whiteout" ]]; then
    cargo run --bin "${prefix}whiteout-labels" ./diagram.png "$json"
elif [[ "$cmd" == "gen-ocr" ]]; then
    cargo run --bin "${prefix}get-ocr" ./diagram.png > out.json
elif [[ "$cmd" == "show-labels" ]]; then
    cargo run --bin "${prefix}label-ocr" ./diagram.png ./out.json
elif [[ "$cmd" == "add-braille" ]]; then
    # Pass the optional font size only if it was supplied.
    cargo run --bin "${prefix}add-braille" ./out.png "$json" ${3:+"$3"}
fi

@ -0,0 +1,99 @@
DejaVu Fonts License
Fonts are (c) Bitstream (see below). DejaVu changes are in public domain.
Glyphs imported from Arev fonts are (c) Tavmjong Bah (see below)
Bitstream Vera Fonts Copyright
———————————————
Copyright (c) 2003 by Bitstream, Inc. All Rights Reserved. Bitstream Vera is
a trademark of Bitstream, Inc.
Permission is hereby granted, free of charge, to any person obtaining a copy
of the fonts accompanying this license (“Fonts”) and associated
documentation files (the “Font Software”), to reproduce and distribute the
Font Software, including without limitation the rights to use, copy, merge,
publish, distribute, and/or sell copies of the Font Software, and to permit
persons to whom the Font Software is furnished to do so, subject to the
following conditions:
The above copyright and trademark notices and this permission notice shall
be included in all copies of one or more of the Font Software typefaces.
The Font Software may be modified, altered, or added to, and in particular
the designs of glyphs or characters in the Fonts may be modified and
additional glyphs or characters may be added to the Fonts, only if the fonts
are renamed to names not containing either the words “Bitstream” or the word
“Vera”.
This License becomes null and void to the extent applicable to Fonts or Font
Software that has been modified and is distributed under the “Bitstream
Vera” names.
The Font Software may be sold as part of a larger software package but no
copy of one or more of the Font Software typefaces may be sold by itself.
THE FONT SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF COPYRIGHT, PATENT,
TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL BITSTREAM OR THE GNOME
FOUNDATION BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, INCLUDING
ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF
THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM OTHER DEALINGS IN THE
FONT SOFTWARE.
Except as contained in this notice, the names of Gnome, the Gnome
Foundation, and Bitstream Inc., shall not be used in advertising or
otherwise to promote the sale, use or other dealings in this Font Software
without prior written authorization from the Gnome Foundation or Bitstream
Inc., respectively. For further information, contact: fonts at gnome dot
org.
Arev Fonts Copyright
———————————————
Copyright (c) 2006 by Tavmjong Bah. All Rights Reserved.
Permission is hereby granted, free of charge, to any person obtaining
a copy of the fonts accompanying this license (“Fonts”) and
associated documentation files (the “Font Software”), to reproduce
and distribute the modifications to the Bitstream Vera Font Software,
including without limitation the rights to use, copy, merge, publish,
distribute, and/or sell copies of the Font Software, and to permit
persons to whom the Font Software is furnished to do so, subject to
the following conditions:
The above copyright and trademark notices and this permission notice
shall be included in all copies of one or more of the Font Software
typefaces.
The Font Software may be modified, altered, or added to, and in
particular the designs of glyphs or characters in the Fonts may be
modified and additional glyphs or characters may be added to the
Fonts, only if the fonts are renamed to names not containing either
the words “Tavmjong Bah” or the word “Arev”.
This License becomes null and void to the extent applicable to Fonts
or Font Software that has been modified and is distributed under the
“Tavmjong Bah Arev” names.
The Font Software may be sold as part of a larger software package but
no copy of one or more of the Font Software typefaces may be sold by
itself.
THE FONT SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT
OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL
TAVMJONG BAH BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL
DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM
OTHER DEALINGS IN THE FONT SOFTWARE.
Except as contained in this notice, the name of Tavmjong Bah shall not
be used in advertising or otherwise to promote the sale, use or other
dealings in this Font Software without prior written authorization
from Tavmjong Bah. For further information, contact: tavmjong @ free
. fr.

@ -0,0 +1,51 @@
use std::env;
use std::fs;
use std::path::Path;
use image::Rgba;
use imageproc::drawing::draw_text_mut;
use louis::modes::DOTS_UNICODE;
use louis::Louis;
use ocr_json_common::TextBox;
use rusttype::{Font, Scale};
fn main() {
    // Positional arguments: <image> <json> [font_size]
    let img_file_name = env::args()
        .nth(1)
        .expect("Please enter a target file path for image");
    let json_file_name = env::args()
        .nth(2)
        .expect("Please enter a target file path for json");
    // The font size is optional and defaults to 20.0.
    let font_size_str = env::args().nth(3).unwrap_or_else(|| "20.0".to_string());
    let font_size: f32 = font_size_str.parse().unwrap();
    let json = fs::read_to_string(json_file_name).expect("There was an error reading the file.");
    let ocr_rects: Vec<TextBox> = serde_json::from_str(&json).unwrap();
    let image_path = Path::new(&img_file_name);
    let black = Rgba([0u8, 0u8, 0u8, 255u8]);
    let font_data: &[u8] = include_bytes!("../../fonts/UBraille.ttf");
    let font: Font<'static> = Font::try_from_bytes(font_data).expect("Error loading font.");
    let mut img = image::open(image_path).unwrap();
    let brl = Louis::new().unwrap();
    // Translate each box's text to braille and draw it at the box's (x, y) position.
    for rect in &ocr_rects {
        let text = rect.hint.clone();
        let brl_text = brl.translate_simple("en_US.tbl", &text, false, DOTS_UNICODE);
        println!("[{}]: {}", rect.id, brl_text);
        draw_text_mut(&mut img, black, rect.x.try_into().unwrap(), rect.y.try_into().unwrap(), Scale::uniform(font_size), &font, &brl_text);
    }
    // The result is always written to out.png in the current directory.
    img.save("out.png").unwrap();
}

@ -0,0 +1,473 @@
use ocr_json_common::TextBox;
use serde_json;
use std::{
    env,
    cmp::min,
    process::Command,
    io::Write,
    fs::OpenOptions,
    fs,
};
use text_io::read;
// TODO: make more extensible
/// Write `json` to `fname`, creating the file or truncating an existing one.
fn save_json(json: String, fname: &String) {
    let mut file = OpenOptions::new().write(true).truncate(true).create(true).open(fname).expect("Unable to open file");
    file.write_all(json.as_bytes()).expect("unable to write to file");
}
// TODO: make more extensible!!!
/// Re-run `btt-label-ocr` on `./diagram.png` with the JSON at `fname`.
fn reload_json(fname: &String) {
    Command::new("cargo")
        .arg("run")
        .arg("--bin")
        .arg("btt-label-ocr")
        .arg("./diagram.png")
        .arg(fname)
        .output()
        .expect("Failed to execute command");
}
/// Return the smallest non-negative integer, as a string, that is not already an id.
fn new_id(ids: &Vec<String>) -> String {
    let mut new_id = 0;
    let mut new_sid = format!("{}", new_id);
    while ids.contains(&new_sid) {
        new_id += 1;
        new_sid = format!("{}", new_id);
    }
    new_sid
}
/// Remove the box with the given id, if present.
fn rem_rect(boxes: &mut Vec<TextBox>, id: String) {
    boxes.retain(|b| b.id != id);
}
/// Append a new, empty-text box; all numeric arguments arrive as strings.
fn new_rect(boxes: &mut Vec<TextBox>, id: String, x: String, y: String, w: String, h: String) {
    // TODO: unsafe
    boxes.push(TextBox {
        id,
        x: x.parse().unwrap(),
        y: y.parse().unwrap(),
        width: w.parse().unwrap(),
        height: h.parse().unwrap(),
        hint: String::new(),
        confidence: 0
    });
}
/// Replace the text of box `xid`; the box is re-appended at the end of the list.
fn set_text(boxes: &mut Vec<TextBox>, xid: String, new_text: String) {
    // TODO: unsafe
    let bx = boxes.iter().find(|b| b.id == xid).unwrap().clone();
    boxes.retain(|b| b.id != xid);
    boxes.push(TextBox{
        id: bx.id,
        confidence: 0,
        hint: new_text.clone(),
        x: bx.x,
        y: bx.y,
        width: bx.width,
        height: bx.height,
    });
}
/// Merge two boxes into their bounding box; the merged box keeps the id `xid`.
fn merge(boxes: &mut Vec<TextBox>, xid: String, yid: String) {
    // TODO: unsafe
    let bx = boxes.iter().find(|b| b.id == xid).unwrap();
    // TODO: unsafe
    let by = boxes.iter().find(|b| b.id == yid).unwrap();
    let y = min(bx.y, by.y);
    let x = min(bx.x, by.x);
    // Span from the least x and y to the greatest x+width and y+height,
    // so the result covers both boxes even when one extends past the other.
    let right = (bx.x + bx.width as i32).max(by.x + by.width as i32);
    let bottom = (bx.y + bx.height as i32).max(by.y + by.height as i32);
    let w = right - x;
    let h = bottom - y;
    let text = format!("{} {}", bx.hint, by.hint);
    let confi = 0;
    let id = bx.id.clone();
    boxes.retain(|b| b.id != xid && b.id != yid);
    boxes.push(TextBox {
        id,
        confidence: confi,
        hint: text,
        x,
        y,
        width: w as u32,
        height: h as u32,
    });
}
/// Split a box into top and bottom halves; the top half keeps the original id.
fn vsplit(boxes: &mut Vec<TextBox>, sid: String) {
    // TODO: unsafe
    let bx = boxes.iter().find(|b| b.id == sid).unwrap();
    let w = bx.width;
    let x = bx.x;
    let h = bx.height/2;
    let y1 = bx.y;
    let y2 = bx.y + (bx.height as i32)/2;
    let hint = bx.hint.clone();
    let mut tsplit = hint.split("\n"); // tesseract likes to use newlines for some reason... use to our advantage
    let t1 = tsplit.next().unwrap_or("");
    let t2 = tsplit.next().unwrap_or("");
    let id1 = bx.id.clone();
    let ids: Vec<String> = boxes.iter().map(|b| b.id.clone()).collect();
    let id2 = new_id(&ids);
    boxes.retain(|b| b.id != sid);
    boxes.push(TextBox {
        x,
        y: y1,
        hint: t1.to_string(),
        confidence: 0,
        id: id1,
        width: w,
        height: h,
    });
    boxes.push(TextBox {
        x,
        y: y2,
        hint: t2.to_string(),
        confidence: 0,
        id: id2,
        width: w,
        height: h,
    });
}
/// Split a box into left and right halves; the left half keeps the original id.
fn hsplit(boxes: &mut Vec<TextBox>, sid: String) {
    // TODO: unsafe
    let bx = boxes.iter().find(|b| b.id == sid).unwrap();
    let w = bx.width/2;
    let y = bx.y;
    let h = bx.height;
    let x1 = bx.x;
    let x2 = bx.x + (bx.width as i32)/2;
    let hint = bx.hint.clone();
    let mut tsplit = hint.split("\n"); // tesseract likes to use newlines for some reason... use to our advantage
    let t1 = tsplit.next().unwrap_or("");
    let t2 = tsplit.next().unwrap_or("");
    let id1 = bx.id.clone();
    let ids: Vec<String> = boxes.iter().map(|b| b.id.clone()).collect();
    let id2 = new_id(&ids);
    boxes.retain(|b| b.id != sid);
    boxes.push(TextBox {
        x: x1,
        y,
        hint: t1.to_string(),
        confidence: 0,
        id: id1,
        width: w,
        height: h,
    });
    boxes.push(TextBox {
        x: x2,
        y,
        hint: t2.to_string(),
        confidence: 0,
        id: id2,
        width: w,
        height: h,
    });
}
/// Trim the left edge: move x right by `mw` and shrink the width by `mw`.
fn triml(boxes: &mut Vec<TextBox>, sid: String, mw: i32) {
    // TODO: unsafe
    let bx = boxes.iter().find(|b| b.id == sid).unwrap().clone();
    boxes.retain(|b| b.id != sid);
    boxes.push(TextBox{
        id: bx.id,
        hint: bx.hint,
        x: bx.x + mw,
        width: bx.width - (mw as u32),
        y: bx.y,
        height: bx.height,
        confidence: 0,
    });
}
/// Trim the right edge: shrink the width by `mw`.
fn trimr(boxes: &mut Vec<TextBox>, sid: String, mw: i32) {
    // TODO: unsafe
    let bx = boxes.iter().find(|b| b.id == sid).unwrap().clone();
    boxes.retain(|b| b.id != sid);
    boxes.push(TextBox{
        id: bx.id,
        hint: bx.hint,
        x: bx.x,
        width: bx.width - (mw as u32),
        y: bx.y,
        height: bx.height,
        confidence: 0,
    });
}
/// Trim the top edge: move y down by `mw` and shrink the height by `mw`.
fn trimt(boxes: &mut Vec<TextBox>, sid: String, mw: i32) {
    // TODO: unsafe
    let bx = boxes.iter().find(|b| b.id == sid).unwrap().clone();
    boxes.retain(|b| b.id != sid);
    boxes.push(TextBox{
        id: bx.id,
        hint: bx.hint,
        x: bx.x,
        width: bx.width,
        y: bx.y + mw,
        height: bx.height - (mw as u32),
        confidence: 0,
    });
}
/// Trim the bottom edge: shrink the height by `mw`.
fn trimb(boxes: &mut Vec<TextBox>, sid: String, mw: i32) {
    // TODO: unsafe
    let bx = boxes.iter().find(|b| b.id == sid).unwrap().clone();
    boxes.retain(|b| b.id != sid);
    boxes.push(TextBox{
        id: bx.id,
        hint: bx.hint,
        x: bx.x,
        width: bx.width,
        y: bx.y,
        height: bx.height - (mw as u32),
        confidence: 0,
    });
}
/// Move the box left by `mw`.
fn movel(boxes: &mut Vec<TextBox>, sid: String, mw: i32) {
    // TODO: unsafe
    let bx = boxes.iter().find(|b| b.id == sid).unwrap().clone();
    boxes.retain(|b| b.id != sid);
    boxes.push(TextBox{
        id: bx.id,
        hint: bx.hint,
        x: bx.x - mw,
        width: bx.width,
        y: bx.y,
        height: bx.height,
        confidence: 0,
    });
}
/// Move the box right by `mw`.
fn mover(boxes: &mut Vec<TextBox>, sid: String, mw: i32) {
    // TODO: unsafe
    let bx = boxes.iter().find(|b| b.id == sid).unwrap().clone();
    boxes.retain(|b| b.id != sid);
    boxes.push(TextBox{
        id: bx.id,
        hint: bx.hint,
        x: bx.x + mw,
        width: bx.width,
        y: bx.y,
        height: bx.height,
        confidence: 0,
    });
}
/// Move the box up by `mw`.
fn moveu(boxes: &mut Vec<TextBox>, sid: String, mw: i32) {
    // TODO: unsafe
    let bx = boxes.iter().find(|b| b.id == sid).unwrap().clone();
    boxes.retain(|b| b.id != sid);
    boxes.push(TextBox{
        id: bx.id,
        hint: bx.hint,
        x: bx.x,
        width: bx.width,
        y: bx.y - mw,
        height: bx.height,
        confidence: 0,
    });
}
/// Move the box down by `mw`.
fn moved(boxes: &mut Vec<TextBox>, sid: String, mw: i32) {
    // TODO: unsafe
    let bx = boxes.iter().find(|b| b.id == sid).unwrap().clone();
    boxes.retain(|b| b.id != sid);
    boxes.push(TextBox{
        id: bx.id,
        hint: bx.hint,
        x: bx.x,
        width: bx.width,
        y: bx.y + mw,
        height: bx.height,
        confidence: 0,
    });
}
fn paddl(boxes: &mut Vec<TextBox>, sid: String, mw: i32) {
// TODO: unsafe
let bx = boxes.iter().find(|b| b.id == sid).unwrap().clone();
boxes.retain(|b| b.id != sid);
boxes.push(TextBox{
id: bx.id,
hint: bx.hint,
x: bx.x - mw,
width: bx.width + (mw as u32),
y: bx.y,
height: bx.height,
confidence: 0,
});
}
fn paddr(boxes: &mut Vec<TextBox>, sid: String, mw: i32) {
    // Pad the right edge: widen the box by `mw` pixels; x and y are unchanged.
    // An unknown id is reported instead of panicking.
    match boxes.iter_mut().find(|b| b.id == sid) {
        Some(bx) => {
            // Signed arithmetic avoids u32 overflow when `mw` is negative.
            bx.width = (bx.width as i32 + mw).max(0) as u32;
            bx.confidence = 0;
        }
        None => println!("No box with id {}", sid),
    }
}
fn paddt(boxes: &mut Vec<TextBox>, sid: String, mw: i32) {
    // Pad the top edge: move y up by `mw` pixels and grow the height by the same amount.
    // An unknown id is reported instead of panicking.
    match boxes.iter_mut().find(|b| b.id == sid) {
        Some(bx) => {
            bx.y -= mw;
            // Signed arithmetic avoids u32 overflow when `mw` is negative.
            bx.height = (bx.height as i32 + mw).max(0) as u32;
            bx.confidence = 0;
        }
        None => println!("No box with id {}", sid),
    }
}
fn paddb(boxes: &mut Vec<TextBox>, sid: String, mw: i32) {
    // Pad the bottom edge: grow the height by `mw` pixels; x and y are unchanged.
    // An unknown id is reported instead of panicking.
    match boxes.iter_mut().find(|b| b.id == sid) {
        Some(bx) => {
            // Signed arithmetic avoids u32 overflow when `mw` is negative.
            bx.height = (bx.height as i32 + mw).max(0) as u32;
            bx.confidence = 0;
        }
        None => println!("No box with id {}", sid),
    }
}
fn main() {
let json_fname = if env::args().count() == 2 {
env::args().nth(1).unwrap()
} else {
panic!("Please enter a file path");
};
let json_str = fs::read_to_string(&json_fname).expect("There was an error reading the provided file.");
let mut boxes: Vec<TextBox> = serde_json::from_str(&json_str).expect("The provided file does not contain valid OCR JSON.");
let mut line: String = String::new();
println!("Type 'exit' to quit!");
while line != "exit" {
line = read!("{}\n");
let mut split = line.split("|");
let command = split.next();
if command == Some("merge") {
// TODO: not safe
let one_id = split.next().unwrap();
let two_id = split.next().unwrap();
merge(&mut boxes, one_id.to_string(), two_id.to_string());
} else if command == Some("vsplit") {
// TODO: not safe
let id = split.next().unwrap();
vsplit(&mut boxes, id.to_string());
} else if command == Some("hsplit") {
// TODO: not safe
let id = split.next().unwrap();
hsplit(&mut boxes, id.to_string());
} else if command == Some("triml") {
// TODO: not safe
let id = split.next().unwrap();
let px = split.next().unwrap();
triml(&mut boxes, id.to_string(), px.parse().unwrap());
} else if command == Some("trimr") {
// TODO: not safe
let id = split.next().unwrap();
let px = split.next().unwrap();
trimr(&mut boxes, id.to_string(), px.parse().unwrap());
} else if command == Some("trimt") {
// TODO: not safe
let id = split.next().unwrap();
let px = split.next().unwrap();
trimt(&mut boxes, id.to_string(), px.parse().unwrap());
} else if command == Some("trimb") {
// TODO: not safe
let id = split.next().unwrap();
let px = split.next().unwrap();
trimb(&mut boxes, id.to_string(), px.parse().unwrap());
} else if command == Some("text") {
// TODO: not safe
let id = split.next().unwrap();
let new_text = split.next().unwrap();
set_text(&mut boxes, id.to_string(), new_text.to_string());
} else if command == Some("moveu") {
// TODO: not safe
let id = split.next().unwrap().to_string();
let diff = split.next().unwrap().parse().unwrap();
moveu(&mut boxes, id, diff);
} else if command == Some("moved") {
// TODO: not safe
let id = split.next().unwrap().to_string();
let diff = split.next().unwrap().parse().unwrap();
moved(&mut boxes, id, diff);
} else if command == Some("mover") {
// TODO: not safe
let id = split.next().unwrap().to_string();
let diff = split.next().unwrap().parse().unwrap();
mover(&mut boxes, id, diff);
} else if command == Some("movel") {
// TODO: not safe
let id = split.next().unwrap().to_string();
let diff = split.next().unwrap().parse().unwrap();
movel(&mut boxes, id, diff);
} else if command == Some("paddl") {
// TODO: not safe
let id = split.next().unwrap().to_string();
let diff = split.next().unwrap().parse().unwrap();
paddl(&mut boxes, id, diff);
} else if command == Some("paddr") {
// TODO: not safe
let id = split.next().unwrap().to_string();
let diff = split.next().unwrap().parse().unwrap();
paddr(&mut boxes, id, diff);
} else if command == Some("paddt") {
// TODO: not safe
let id = split.next().unwrap().to_string();
let diff = split.next().unwrap().parse().unwrap();
paddt(&mut boxes, id, diff);
} else if command == Some("paddb") {
// TODO: not safe
let id = split.next().unwrap().to_string();
let diff = split.next().unwrap().parse().unwrap();
paddb(&mut boxes, id, diff);
} else if command == Some("add") {
// TODO: not safe
let ids: Vec<String> = boxes.iter().map(|b| b.id.clone()).collect();
let id = new_id(&ids);
let x = split.next().unwrap();
let y = split.next().unwrap();
let w = split.next().unwrap();
let h = split.next().unwrap();
new_rect(&mut boxes,
id.to_string(),
x.to_string(),
y.to_string(),
w.to_string(),
h.to_string());
} else if command == Some("save") {
// TODO: unsafe
let fname = split.next().unwrap();
let json_out = serde_json::to_string(&boxes).unwrap();
save_json(json_out, &fname.to_string());
println!("Saved as {}", fname);
} else if command == Some("show") {
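// Print the box as JSON; a missing or unknown id is reported instead of panicking.
match split.next().and_then(|id| boxes.iter().find(|b| b.id == id)) {
Some(bx) => println!("JSON: {}", serde_json::to_string(bx).unwrap()),
None => println!("Usage: show|id (no box with that id)"),
}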
} else if command == Some("rem") {
// TODO: unsafe
let id = split.next().unwrap();
rem_rect(&mut boxes, id.to_string());
} else if command == Some("exit") {
// Leave the loop directly; the file is saved after every command, so nothing is lost.
break;
} else {
println!("Invalid command.");
}
let json_out = serde_json::to_string(&boxes).unwrap();
save_json(json_out, &json_fname);
reload_json(&json_fname);
}
}

@ -0,0 +1,53 @@
extern crate leptess;
use ocr_json_common::TextBox;
use leptess::{leptonica, tesseract};
use std::env;
use std::path::Path;
/* TODO: preprocess the image here */
fn main() {
let mut ocr_rects = Vec::new();
let file_name = if env::args().count() == 2 {
env::args().nth(1).unwrap()
} else {
panic!("Please enter a target file path")
};
let image_path = Path::new(&file_name);
let mut api = tesseract::TessApi::new(Some("/usr/share/tessdata/"), "eng").unwrap();
let pix = leptonica::pix_read(image_path).unwrap();
api.set_image(&pix);
// detect bounding boxes for words
let boxes = api
.get_component_images(leptess::capi::TessPageIteratorLevel_RIL_WORD, true)
.unwrap();
let mut boxid = 0;
// run OCR on each word bounding box
for b in &boxes {
api.set_rectangle(&b);
let text = api.get_utf8_text().unwrap();
let confi = api.mean_text_conf();
let bref = b.as_ref();
/*
println!(
"[X: {}, Y: {}, W: {}, H: {}]: confidence: {}, text: {}",
bref.x, bref.y, bref.w, bref.h, confi, text
);*/
ocr_rects.push(TextBox {
id: format!("{}", boxid),
hint: text,
confidence: confi as u32,
x: bref.x,
y: bref.y,
height: bref.h as u32,
width: bref.w as u32,
});
boxid += 1;
}
let json = serde_json::to_string(&ocr_rects).unwrap();
println!("{}", json);
}

@ -0,0 +1,45 @@
use std::fs;
use ocr_json_common::TextBox;
use image::{Rgba};
use imageproc::drawing::{
draw_hollow_rect_mut,
draw_text_mut,
};
use imageproc::rect::Rect;
use std::env;
use std::path::Path;
use rusttype::{Font, Scale};
fn main() {
let img_file_name = if env::args().count() >= 2 {
env::args().nth(1).unwrap()
} else {
panic!("Please enter a target file path for image")
};
let json_file_name = if env::args().count() >= 3 {
env::args().nth(2).unwrap()
} else {
panic!("Please enter a target file path for json")
};
let json = fs::read_to_string(json_file_name).expect("There was an error reading the file.");
let ocr_rects: Vec<TextBox> = serde_json::from_str(&json).unwrap();
let image_path = Path::new(&img_file_name);
let red = Rgba([255u8, 0u8, 0u8, 255u8]);
let font_data: &[u8] = include_bytes!("../../fonts/DejaVuSansMono.ttf");
let font: Font<'static> = Font::try_from_bytes(font_data).expect("Error loading font.");
let mut img = image::open(image_path).unwrap();
// draw a red outline and an id label for each OCR box
for rect in &ocr_rects {
draw_hollow_rect_mut(&mut img, Rect::at(rect.x, rect.y).of_size(rect.width, rect.height), red);
// Clamp at zero so boxes near the left or top edge do not wrap around when cast to u32.
let y: u32 = rect.y.max(0) as u32;
let x: u32 = (rect.x - 25).max(0) as u32;
let text = rect.id.clone();
//let text = "⠨⠙⠕⠃⠗⠕⠙⠕⠱⠇⠊";
draw_text_mut(&mut img, red, x, y, Scale::uniform(20.0), &font, &text);
}
img.save("out.png").unwrap();
}

@ -0,0 +1,35 @@
use std::fs;
use ocr_json_common::TextBox;
use image::{Rgba};
use imageproc::drawing::{
draw_filled_rect_mut
};
use imageproc::rect::Rect;
use std::env;
use std::path::Path;
fn main() {
let img_file_name = if env::args().count() >= 2 {
env::args().nth(1).unwrap()
} else {
panic!("Please enter a target file path for image")
};
let json_file_name = if env::args().count() >= 3 {
env::args().nth(2).unwrap()
} else {
panic!("Please enter a target file path for json")
};
let json = fs::read_to_string(json_file_name).expect("There was an error reading the file.");
let ocr_rects: Vec<TextBox> = serde_json::from_str(&json).unwrap();
let image_path = Path::new(&img_file_name);
let white = Rgba([255u8, 255u8, 255u8, 255u8]);
let mut img = image::open(image_path).unwrap();
// paint a filled white rectangle over each OCR box to erase the original text
for rect in &ocr_rects {
draw_filled_rect_mut(&mut img, Rect::at(rect.x, rect.y).of_size(rect.width, rect.height), white);
}
img.save("out.png").unwrap();
}

@ -0,0 +1,3 @@
fn main() {
println!("Hello, world!");
}

@ -0,0 +1 @@
Subproject commit 954e7a20b3fbe2bc613d1ec072aa8e4704f2c691