
Module textutil

Utility functions for operating on text

These text utilities have no dependencies on any other part of peppy, and therefore may be used independently of peppy.

Functions

piglatin(text)
Translate string to pig latin.

getMagicComments(bytes, headersize=1024)
Given a byte string, get the first two lines.

detectEncoding(bytes)
Search for "magic comments" that specify the encoding.

parseEmacs(header)
Determine if the header specifies a major mode.
Returns: tuple

guessBinary(text, percentage)
Guess if this is a text or binary file.
Returns: boolean

Function Details

piglatin(text)

 

Translate string to pig latin.

Simple pig latin translator that properly capitalizes the resulting string, and skips over any leading or trailing non-alphabetic characters.
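
The behaviour described above can be illustrated with a minimal sketch. It is not peppy's implementation: the names piglatin_sketch and _pig_word are hypothetical, and the rules chosen here (move the leading consonants to the end and add "ay"; add "way" to words that start with a vowel or contain none) are only one common pig latin convention.

```python
import re

# Runs of ASCII letters; everything else (spaces, punctuation, digits)
# is left exactly where it was.
_WORD = re.compile(r"[A-Za-z]+")
_VOWELS = "aeiouAEIOU"


def _pig_word(word):
    """Translate one alphabetic word (hypothetical helper)."""
    # Find the first vowel; words without a vowel are treated like
    # vowel-initial words, which is an assumption of this sketch.
    for index, char in enumerate(word):
        if char in _VOWELS:
            break
    else:
        index = 0
    if index == 0:
        result = word + "way"
    else:
        result = word[index:] + word[:index] + "ay"
    # Normalize case, then restore an initial capital if the word had one.
    result = result.lower()
    if word[0].isupper():
        result = result.capitalize()
    return result


def piglatin_sketch(text):
    """Translate every alphabetic run in text, keeping other characters."""
    return _WORD.sub(lambda match: _pig_word(match.group(0)), text)
```

With these rules, piglatin_sketch("Hello, world!") gives "Ellohay, orldway!": the capital moves to the new first letter and the punctuation stays in place.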

getMagicComments(bytes, headersize=1024)

 

Given a byte string, get the first two lines.

"Magic comments" appear in the first two lines of the file, and can indicate the encoding of the file or the major mode in which the file should be interpreted.
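
As an illustration of the idea, here is a minimal sketch that extracts the first two lines from a byte string and looks for an encoding declaration in them. It is not the module's code: the function names are hypothetical, and the regular expression follows the pattern given in PEP 263 for Python source files, which is an assumption about what an encoding magic comment looks like.

```python
import re

# Pattern from PEP 263 for "coding: name" or "coding=name" in a comment line.
_CODING = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def get_magic_comments_sketch(data, headersize=1024):
    """Return up to two leading lines found in the first headersize bytes."""
    header = data[:headersize]
    # bytes.splitlines() handles \n, \r\n and \r line endings.
    return header.splitlines()[:2]


def find_encoding_sketch(data, headersize=1024):
    """Return the declared encoding name, or None if no line declares one."""
    for line in get_magic_comments_sketch(data, headersize):
        match = _CODING.match(line)
        if match:
            return match.group(1).decode("ascii")
    return None
```

Limiting the search to headersize bytes keeps the cost constant even for very large files or files that contain no line breaks at all.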

parseEmacs(header)

 

Determine if the header specifies a major mode.

Parse a potential Emacs major mode specifier line into the mode and the optional variables. The mode may appear in any of these forms:

 -*-C++-*-
 -*- mode: Python; -*-
 -*- mode: Ksh; var1:value1; var3:value9; -*-
Parameters:
  • header - first x bytes of the file to be loaded
Returns: tuple
two-tuple of the mode and a dict of the name/value pairs.
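
The three forms listed above can be parsed along these lines. This is a sketch under stated assumptions, not the actual parseEmacs: the name parse_emacs_sketch is hypothetical, the header is assumed to be already decoded to a string, and returning (None, {}) when no specifier is found is a choice made for this example.

```python
import re

# Text between the first pair of -*- markers.
_MODELINE = re.compile(r"-\*-\s*(.*?)\s*-\*-")


def parse_emacs_sketch(header):
    """Return (mode, variables) parsed from a -*- ... -*- specifier."""
    match = _MODELINE.search(header)
    if match is None:
        return None, {}
    mode = None
    variables = {}
    for part in match.group(1).split(";"):
        part = part.strip()
        if not part:
            continue
        if ":" in part:
            name, value = part.split(":", 1)
            name, value = name.strip(), value.strip()
            if name.lower() == "mode":
                mode = value
            else:
                variables[name] = value
        elif mode is None:
            # Bare form such as -*-C++-*-: the whole body is the mode.
            mode = part
    return mode, variables
```

For the third form above, parse_emacs_sketch("-*- mode: Ksh; var1:value1; var3:value9; -*-") returns ("Ksh", {"var1": "value1", "var3": "value9"}).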

guessBinary(text, percentage)

 

Guess if this is a text or binary file.

Guess whether the contents of this file are binary or text by scanning the first amount characters and checking whether some percentage of them fall outside the printable ASCII range.

This is a poor check for Unicode files, so treat it as a rough heuristic.

Parameters:
  • amount (int) - number of characters to check at the beginning of the file
  • percentage (number) - percentage of characters that must be in the printable ASCII range
Returns: boolean
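
A heuristic of this kind might look like the sketch below. Several details are assumptions rather than documented behaviour: the name guess_binary_sketch is hypothetical, the input is taken to be a bytes object, amount is given as a keyword argument with a default of 1024, tab, line feed and carriage return are counted as text, and the function returns True (binary) when the share of bytes outside the printable ASCII range exceeds percentage.

```python
# Printable ASCII (space through tilde) plus tab, line feed, carriage return.
_TEXT_BYTES = frozenset(range(32, 127)) | {9, 10, 13}


def guess_binary_sketch(data, percentage, amount=1024):
    """Return True if the leading bytes of data look binary."""
    sample = data[:amount]
    if not sample:
        # An empty file has nothing that looks binary.
        return False
    # Iterating over bytes yields integers in Python 3.
    outside = sum(1 for byte in sample if byte not in _TEXT_BYTES)
    return 100.0 * outside / len(sample) > percentage
```

UTF-16 text or UTF-8 text that is mostly non-Latin would be classified as binary by this check, which matches the limitation noted above.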