Usage
=====

Create the |extractor| instance
-------------------------------

First, import the |extractor| class:

.. code-block:: python

    from chopper.extractor import Extractor

Then create an |extractor| instance, either by instantiating one explicitly or by calling the |keep| and |discard| class methods directly:

.. code-block:: python

    from chopper.extractor import Extractor

    # Instantiate style
    extractor = Extractor().keep('//div').discard('//a')

    # Class method style
    extractor = Extractor.keep('//div').discard('//a')

Add XPath expressions
---------------------

The |extractor| instance allows you to chain multiple |keep| and |discard| calls:

.. code-block:: python

    from chopper.extractor import Extractor

    e = Extractor.keep('//div[p]').discard('//span').discard('//a').keep('strong')

Extract contents
----------------

Once your |extractor| instance is created, call the |extract| method on it.

The |extract| method takes at least one argument: the HTML to parse. To parse CSS as well, pass it as the second argument.

.. warning::

    Depending on the size of the CSS, CSS parsing and cleaning can be much slower than HTML parsing and cleaning.

.. code-block:: python

    from chopper.extractor import Extractor

    HTML = """
    Main content
    See morecontent
    content
    content
    See more