The Conservative Soul: Fundamentalism, Freedom, and the Future of the Right Paperback – October 9, 2007

if the acceptance and love of others as they are is the essence of Christianity, then the acceptance of our loneliness and doubt in a world far beyond our understanding is the core of all non-fundamentalist religion.

pg 219  Andrew Sullivan, The Conservative Soul

 

What religion can be at its most sublime is the fusion of that wonder we should really feel all the time in the presence of God. What religion can be at its most sublime is the fusion of that wonder with practical life. It is the marriage of the poetic and practical modes of experience. This does not require the imposition of fixed rules and doctrines, although they may be helpful guides from time to time. It requires a constant reimagination of the potential of life lived on earth as if it were heaven. It requires letting go of our desire not to let go. Jesus saw it in children. One of his most radical teachings was the notion that only if we become like children will we enter the kingdom of God.

Children love rituals, and their games are full of them. Perhaps because they are not yet fully formed, every moment matters more. We older types have sometimes become inured to the wonder and mystery of everything.

pg 222

 

These moments may come upon us when we least expect them. We may see flashes of eternity in the simple grin of a child in a game of hide and seek, in the approach of the tide on an autumn afternoon, in the eyes of a lover in sex, or in grandmother’s ritual, but we know them when we see them. The key is to be open to them, because they happen all the time, all around us. But we are too “busy” to notice.

The opposite of this kind of faith is fundamentalism: the constant recourse to abstraction and authority or text.

Zero-Width Characters: Invisibly fingerprinting text

Journalists watch out—you may be unintentionally revealing sources.

Countermeasures for journalists or others engaged with leakers, in decreasing order of effectiveness:

  • Avoid releasing excerpts and raw documents.
  • Get the same documents from multiple leakers to ensure they have the exact same content on a byte-by-byte level.
  • Manually retype excerpts to avoid invisible characters and homoglyphs.
  • Keep excerpts short to limit the amount of information shared.
  • Use a tool that strips non-whitelisted characters from text before sharing it with others.
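
The last countermeasure can be sketched in a few lines of Python. This is a minimal illustration, not a vetted tool: the INVISIBLE set is a small, non-exhaustive sample of zero-width code points, the ASCII-only whitelist is an assumption chosen for simplicity, and the function names are hypothetical.

```python
import string

# A few zero-width / invisible code points (illustrative, not exhaustive).
INVISIBLE = {
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE (byte order mark)
}

# Assumed whitelist: printable ASCII plus space, newline and tab.
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation + " \n\t")


def find_invisible(text):
    """Return (index, code point) pairs for each known invisible character."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in INVISIBLE]


def strip_to_whitelist(text, allowed=ALLOWED):
    """Drop every character that is not explicitly whitelisted."""
    return "".join(ch for ch in text if ch in allowed)


sample = "se\u200bcret\u200d memo"
print(find_invisible(sample))
print(strip_to_whitelist(sample))
```

An ASCII whitelist also drops legitimate non-ASCII text such as curly quotes and accented letters, and it deletes homoglyphs rather than mapping them back to the letters they imitate, so the stripped output should be compared against the original before sharing.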

lambda-text-extractor

lambda-text-extractor is a Python 3.6 app that works with the AWS Lambda architecture to extract text from common binary document formats.

Features

Some of its key features are:

  • out-of-the-box support for many common binary document formats (see section on Supported Formats),
  • scalable PDF parsing, with OCR run in parallel using AWS Lambda and asyncio,
  • creation of text-searchable PDFs after OCR,
  • a serverless architecture that makes deployment quick and easy,
  • detailed instructions for preparing the libraries and dependencies necessary for processing binary documents, and
  • sensible Unicode handling.
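
The parallel OCR feature suggests a fan-out pattern: one task per page, gathered back in page order. The sketch below shows that general pattern with asyncio only. It is not lambda-text-extractor's actual code: ocr_page is a hypothetical stand-in for whatever invokes the per-page OCR worker, max_concurrency is an assumed parameter, and asyncio.run requires Python 3.7 or later.

```python
import asyncio


async def ocr_page(page_number: int) -> str:
    # Hypothetical stand-in for a call that OCRs a single page remotely.
    await asyncio.sleep(0)
    return f"text of page {page_number}"


async def ocr_document(page_count: int, max_concurrency: int = 10) -> str:
    # Cap the number of pages processed at the same time.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(page_number: int) -> str:
        async with semaphore:
            return await ocr_page(page_number)

    # gather() returns results in the order the awaitables were passed,
    # so the pages come back in reading order.
    pages = await asyncio.gather(*(bounded(n) for n in range(1, page_count + 1)))
    return "\n".join(pages)


print(asyncio.run(ocr_document(3)))
```

The semaphore is one simple way to avoid launching thousands of concurrent invocations for a very long PDF.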

Supported Formats

lambda-text-extractor supports many common and legacy document formats:

  • Portable Document Format (.pdf),
  • Microsoft Word 2, 6, 7, 97, 2000, 2002 and 2003 (.doc) using Antiword with fallback to Catdoc,
  • Microsoft Word 2007 OpenXML files (.docx) using python-docx,
  • Microsoft PowerPoint 2007 OpenXML files (.pptx) using python-pptx,
  • Microsoft Excel 5.0, 97-2003, and 2007 OpenXML files (.xls, .xlsx) using xlrd,
  • OpenDocument 1.2 (.odm, .odp, .ods, .odt, .oth, .otm, .otp, .ots, .ott) using odfpy,
  • Rich Text Format (.rtf) using UnRTF v0.21.9,
  • XML files and HTML web pages (.html, .htm, .xml) using lxml,
  • CSV files (.csv) using Python csv module,
  • Images (.tiff, .jpg, .jpeg, .png) using Tesseract, and
  • Plain text files (.txt).
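
A format list like this maps naturally onto a table of per-extension extractor functions. The sketch below covers only the two formats that need nothing beyond the standard library (.txt and .csv). The function names, the EXTRACTORS table, and the UTF-8 decoding are assumptions for illustration and do not reflect lambda-text-extractor's real internals.

```python
import csv
import io
import os


def extract_txt(data: bytes) -> str:
    # Assumes UTF-8; undecodable bytes are replaced rather than raising.
    return data.decode("utf-8", errors="replace")


def extract_csv(data: bytes) -> str:
    # newline="" lets the csv module handle line endings itself.
    text = data.decode("utf-8", errors="replace")
    rows = csv.reader(io.StringIO(text, newline=""))
    return "\n".join(" ".join(row) for row in rows)


# Hypothetical dispatch table: file extension -> extractor function.
EXTRACTORS = {
    ".txt": extract_txt,
    ".csv": extract_csv,
}


def extract_text(filename: str, data: bytes) -> str:
    ext = os.path.splitext(filename)[1].lower()
    if ext not in EXTRACTORS:
        raise ValueError(f"unsupported format: {ext!r}")
    return EXTRACTORS[ext](data)


print(extract_text("people.CSV", b"name,city\nAda,London\n"))
```

Supporting another format then means adding one function and one table entry, for example a .docx extractor built on python-docx.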