Automatically fetch all pages of multi-page articles

Many sites split up articles into multiple pages. For example: http://www.gamasutra.com/view/feature/4225/persuasive_games_puzzling_the_.php

I love NewsRob's support for downloading the full feed and HTML contents to be read offline, but when an article spans multiple pages, NewsRob only gets the first page. I wish it could automatically download all of them.

I know this would be ambitious and difficult, so I'm not holding my breath. There's definitely precedent, though. Here are some thoughts:

- Many sites' "printable" versions of articles include the full contents across all pages. You could detect @media print CSS stylesheets (see http://w3schools.com/css/css_mediatypes.asp) and request/re-render the page using that stylesheet if available.

- If there's no @media print stylesheet, but there is a link that includes "Print" or "Printable", you could download that too, in the hopes that it has the full contents.

- There are browser plugins and Greasemonkey scripts that automatically detect and follow "Next" and "Previous" links; http://nextplease.mozdev.org/ is one example. You could use similar code to download every page.
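As a rough sketch of these heuristics (not NewsRob code — NewsRob is a Java/Android app — but a Python illustration using only the standard library, with hypothetical names like `LinkScanner` and `scan`), a single pass over a page's HTML could collect printable-version links, next-page links, and whether a print stylesheet is declared:

```python
import re
from html.parser import HTMLParser

# Hypothetical heuristics: words suggesting a link leads to a printable/full
# version, and anchor text that looks like a "next page" link.
PRINT_WORDS = re.compile(r"print(er[- ]friendly)?|single page", re.I)
NEXT_WORDS = re.compile(r"^\s*next(\s+page)?\s*[»>]?\s*$", re.I)

class LinkScanner(HTMLParser):
    """Collects candidate 'printable' and 'next page' links from one page."""

    def __init__(self):
        super().__init__()
        self.print_links = []       # links whose text suggests a printable version
        self.next_links = []        # links whose text (or rel) suggests pagination
        self.has_print_css = False  # page declares a print stylesheet
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self._href = attrs.get("href")
            self._text = []
        elif tag == "link":
            if (attrs.get("rel") or "").lower() == "next" and attrs.get("href"):
                # <link rel="next"> is an explicit pagination hint
                self.next_links.append(attrs["href"])
            elif "print" in (attrs.get("media") or "").lower():
                # <link media="print"> declares a print stylesheet
                self.has_print_css = True

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            if PRINT_WORDS.search(text):
                self.print_links.append(self._href)
            elif NEXT_WORDS.match(text):
                self.next_links.append(self._href)
            self._href = None

def scan(html):
    """Return (print_links, next_links, has_print_css) for one page of HTML."""
    scanner = LinkScanner()
    scanner.feed(html)
    return scanner.print_links, scanner.next_links, scanner.has_print_css
```

A caller could then fetch the printable version when one is found, and otherwise follow the next-page links one by one until none remain.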

83 votes

Ryan B shared this idea

    5 comments

      • D. M. Miller commented  · 

        There is much variation between sites; some may have a link, others a button, either of which may be labeled "Print", "Printer-friendly", or "Single Page" (or whatever).

        My suggestion is to have a setting in Manage Feed where users may enter text for a "single page" link. If specified, the initial page would be searched for such a link, and if found, followed, with the resulting page content captured instead of that of the initial page.

        Otherwise, behavior remains as it is today.

        Maybe also allow specifying whether it is a link or a button, if that matters.
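The per-feed lookup this comment describes could be sketched roughly as follows (a Python illustration using only the standard library; the names `SinglePageLinkFinder` and `single_page_url` are hypothetical, and the real setting would live in NewsRob's Manage Feed screen, so this is only a sketch of the matching logic):

```python
from html.parser import HTMLParser

class SinglePageLinkFinder(HTMLParser):
    """Finds the first link whose visible text contains the configured phrase."""

    def __init__(self, phrase):
        super().__init__()
        self.phrase = phrase.lower()
        self.href = None       # result: URL of the matching link, if any
        self._current = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self.href is None:
            self._current = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            if self.phrase in "".join(self._text).lower():
                self.href = self._current
            self._current = None

def single_page_url(html, phrase):
    """Return the URL to fetch instead of the initial page, or None to keep it."""
    if not phrase:
        return None            # no phrase configured: behavior stays as today
    finder = SinglePageLinkFinder(phrase)
    finder.feed(html)
    return finder.href
```

When `single_page_url` returns None (no setting, or no matching link on the page), the initial page's content is captured as today.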

      • Mariano Kamp (Admin, newsrob) commented  · 

        Henrik, I talked to them already some time ago.

        The problems were, if I remember correctly, that it is too slow to execute on the client side and that it is not easy to do with a WebView in the background.

        However, Arc90 recently announced a node.js DOM extension, which could make this implementable if I ran my own server, etc.

        But it's not a quick fix. I would expect, however, that other people may do this, and I would include them as an option besides gwt/instapaper. We'll see.

      • Henrik Heimbuerger commented  · 

        I agree that NewsRob shouldn't reimplement this itself. However, there are web apps doing this, e.g. Readability: http://lab.arc90.com/experiments/readability/
        (It works perfectly on the example page linked above.)

        Maybe another semi-collaboration is in order, similar to the Instapaper mobilizer. I'm not sure whether Readability intends to turn it into a commercial product.
        I think it currently also runs mostly client-side.

      • droidgren commented  · 

        This is nothing that is up to Google Reader, and hence NewsRob. Ask the sites' authors to make their pages better fitted for mobile.
        However, for this particular site you are asking about, I could easily "fix" what you want with a little ripper called Betterfeed :)
        Just use this feed for the articles and you are done:
        http://kronisk.bananbox.se/betterfeed/gamasutra
