JSoup character encoding issue
<p>I am using JSoup to parse content from <a href="http://www.latijnengrieks.com/vertaling.php?id=5368" rel="noreferrer">http://www.latijnengrieks.com/vertaling.php?id=5368</a>. This is a third-party website, and it does not specify a proper encoding. I am using the following code to load the data:</p>
<pre><code>import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class Loader {
    public static void main(String[] args) {
        String url = "http://www.latijnengrieks.com/vertaling.php?id=5368";
        Document doc;
        try {
            // Fetch and parse the page; JSoup picks the charset itself here.
            doc = Jsoup.connect(url).timeout(5000).get();
            Element content = doc.select("div.kader").first();
            // The header table sits two levels above the "kopje" element.
            Element contenttableElement = content.getElementsByClass("kopje").first().parent().parent();
            String contenttext = content.html();
            String tabletext = contenttableElement.html();
            contenttext = Jsoup.parse(contenttext).text();
            contenttext = contenttext.replace("br2n", "\n");
            // Mark each &lt;br&gt; with a placeholder, strip the markup, then restore newlines.
            tabletext = Jsoup.parse(tabletext.replaceAll("(?i)&lt;br[^&gt;]*&gt;", "br2n")).text();
            tabletext = tabletext.replace("br2n", "\n");
            // Drop the leading table text so only the translation remains.
            String text = contenttext.substring(tabletext.length(), contenttext.length());
            System.out.println(text);
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
</code></pre>
<p>This gives the following output:</p>
<pre><code>Aeneas dwaalt rond in Troje en zoekt Cre?sa. Cre?sa is echter op de vlucht gestorven Plotseling verschijnt er een schim. Het is de schim van Cre?sa. De schim zegt:'De oorlog woedt!' Troje is ingenomen! Cre?sa is gestorven:'Vlucht!' Aeneas vlucht echter niet. Dan spreekt de schim:'Vlucht! Er staat jou een nieuw vaderland en een nieuw koninkrijk te wachten.' Dan pas gehoorzaamt Aeneas en vlucht. </code></pre>
<p>Is there any way to get the original character (ü) back in place of the ? marks in the output?</p>
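A symptom like the one above usually points to bytes being decoded with the wrong charset. The following is a minimal sketch of that effect using only the Java standard library; it does not fetch the page. The class name `CharsetDemo`, the helper `decode`, and the choice of ISO-8859-1 as the page's real encoding are assumptions made for illustration, since the site's actual charset is not stated in the question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how one byte sequence decodes differently
// depending on the charset that is assumed.
public class CharsetDemo {

    // Decode raw bytes with an explicitly named charset instead of a guessed default.
    public static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // "Creüsa" as a Latin-1 server would send it (assumption): ü is the single byte 0xFC.
        byte[] raw = "Cre\u00fcsa".getBytes(StandardCharsets.ISO_8859_1);

        // Decoding with the wrong charset loses the ü; the right one preserves it.
        String wrong = decode(raw, "UTF-8");
        String right = decode(raw, "ISO-8859-1");

        System.out.println("UTF-8 guess keeps the ü:      " + wrong.equals("Cre\u00fcsa"));
        System.out.println("ISO-8859-1 guess keeps the ü: " + right.equals("Cre\u00fcsa"));
    }
}
```

If the same cause applies here, one option to try is to give JSoup the charset explicitly, for example through the `Jsoup.parse(InputStream in, String charsetName, String baseUri)` overload, once the page's real encoding is known. The console's own output encoding can also turn unprintable characters into `?`, so that is worth checking separately.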