Capturing screenshots of a website with Python

brew install phantomjs  # or: npm install -g phantomjs
pip install selenium

from selenium import webdriver

# webdriver.PhantomJS is only available in older Selenium releases (it was removed in Selenium 4)
driver = webdriver.PhantomJS()
driver.set_window_size(1024, 768)  # set the window size that you need
driver.get('https://github.com')
driver.save_screenshot('github.png')
driver.quit()  # shut down the PhantomJS process
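To capture several pages in one run, the same three driver calls can be wrapped in a small helper. This is only a sketch: the names capture_screenshots and screenshot_name are hypothetical (they are not part of Selenium), and naming each file after the URL's host is an assumption made for illustration. The helper accepts any already-created driver object, so it does not depend on which browser backend is used.

```python
from urllib.parse import urlparse


def screenshot_name(url):
    """Hypothetical helper: build a PNG filename from the URL's host part."""
    host = urlparse(url).netloc or "page"  # fall back when no host can be parsed
    return host.replace(":", "_") + ".png"  # avoid ':' from host:port in filenames


def capture_screenshots(driver, urls, size=(1024, 768)):
    """Load each URL in an existing WebDriver and save one screenshot per URL.

    `driver` only needs set_window_size(), get() and save_screenshot(),
    the same calls used in the snippet above.
    Returns the list of filenames that were requested.
    """
    driver.set_window_size(*size)
    saved = []
    for url in urls:
        driver.get(url)
        name = screenshot_name(url)
        driver.save_screenshot(name)
        saved.append(name)
    return saved
```

With Selenium and a browser driver installed, this could be called as capture_screenshots(driver, ['https://github.com']), which would request a file named github.com.png. Note that two URLs on the same host would map to the same filename under this naming scheme.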
 

scanR: OCR as a service

scanR is a simple OCR API service that supports 32 languages and can extract text from images or PDF files.

Real-time image to text: post us your image and get a response with the text it contains.
No need to manage servers or infrastructure. Simply call our API to get the text inside any image.