source: trunk/src/org/expeditee/io/WebParser.java@ 1415

Last change on this file since 1415 was 1415, checked in by bln4, 5 years ago

Renamed Frame.getItems() to Frame.getSortedItems() to better represent its functionality.

-> org.apollo.ApolloGestureActions
-> org.apollo.ApolloSystem
-> org.expeditee.actions.Actions
-> org.expeditee.actions.Debug
-> org.expeditee.actions.ExploratorySearchActions
-> org.expeditee.actions.JfxBrowserActions
-> org.expeditee.actions.Misc
-> org.expeditee.actions.Navigation
-> org.expeditee.actions.ScriptBase
-> org.expeditee.actions.Simple
-> org.expeditee.agents.ComputeTree
-> org.expeditee.agents.CopyTree
-> org.expeditee.agents.DisplayComet
-> org.expeditee.agents.DisplayTree
-> org.expeditee.agents.DisplayTreeLeaves
-> org.expeditee.agents.GraphFramesetLinks
-> org.expeditee.agents.TreeProcessor
-> org.expeditee.gio.gesture.StandardGestureActions
-> org.expeditee.gui.DisplayController
-> org.expeditee.gui.FrameCreator
-> org.expeditee.gui.FrameIO
-> org.expeditee.io.DefaultTreeWriter
-> org.expeditee.io.JavaWriter
-> org.expeditee.io.PDF2Writer
-> org.expeditee.io.TXTWriter
-> org.expeditee.io.WebParser
-> org.expeditee.io.flowlayout.XGroupItem
-> org.expeditee.items.Dot
-> org.expeditee.items.Item
-> org.expeditee.items.ItemUtils
-> org.expeditee.network.FrameShare
-> org.expeditee.stats.TreeStats


Created ItemsList class to wrap ArrayList<Item>. Frames now use this new class to store its body list (used for display) as well as its primaryBody and surrogateBody.

-> org.expeditee.agents.Format
-> org.expeditee.agents.HFormat
-> org.expeditee.gio.gesture.StandardGestureActions
-> org.expeditee.gui.Frame
-> org.expeditee.gui.FrameUtils


Refactorted Frame.setResort(bool) to Frame.invalidateSorted() to better function how it is intended to with a more accurate name.

-> org.expeditee.agents.Sort


When writing out .exp files and getting attributes to respond to LEFT + RIGHT click, boolean items are by default true. This has always been the case. An ammendment to this is that defaults can now be established.
Also added 'EnterClick' functionality. If cursored over a item with this property and you press enter, it acts as if you have clicked on it instead.

-> org.expeditee.assets.resources-public.framesets.authentication.1.exp to 6.exp
-> org.expeditee.gio.gesture.StandardGestureActions
-> org.expeditee.gio.input.KBMInputEvent
-> org.expeditee.gio.javafx.JavaFXConversions
-> org.expeditee.gio.swing.SwingConversions
-> org.expeditee.gui.AttributeUtils
-> org.expeditee.io.Conversion
-> org.expeditee.io.DefaultFrameWriter
-> org.expeditee.items.Item


Fixed a bug caused by calling Math.abs on Integer.MIN_VALUE returning unexpected result. Due to zero being a thing, you cannot represent Math.abs(Integer.MIN_VALUE) in a Integer object. The solution is to use Integer.MIN_VALUE + 1 instead of Integer.MIN_VALUE.

-> org.expeditee.core.bounds.CombinationBounds
-> org.expeditee.io.flowlayout.DimensionExtent


Recoded the contains function in EllipticalBounds so that intersection tests containing circles work correctly.

-> org.expeditee.core.bounds.EllipticalBounds


Added toString() to PolygonBounds to allow for useful printing during debugging.

-> org.expeditee.core.bounds.PolygonBounds

Implemented Surrogate Mode!

-> org.expeditee.encryption.io.EncryptedExpReader
-> org.expeditee.encryption.io.EncryptedExpWriter
-> org.expeditee.encryption.items.surrogates.EncryptionDetail
-> org.expeditee.encryption.items.surrogates.Label
-> org.expeditee.gui.FrameUtils
-> org.expeditee.gui.ItemsList
-> org.expeditee.items.Item
-> org.expeditee.items.Text


???? Use Integer.MAX_VALUE cast to a float instead of Float.MAX_VALUE. This fixed some bug which I cannot remember.

-> org.expeditee.gio.TextLayoutManager
-> org.expeditee.gio.swing.SwingTextLayoutManager


Improved solution for dealing with the F10 key taking focus away from Expeditee due to it being a assessibility key.

-> org.expeditee.gio.swing.SwingInputManager


Renamed variable visibleItems in FrameGraphics.paintFrame to itemsToPaintCanditates to better represent functional intent.

-> org.expeditee.gui.FrameGraphics


Improved checking for if personal resources exist before recreating them

-> org.expeditee.gui.FrameIO


Repeated messages to message bay now have a visual feedback instead of just a beep. This visual feedback is in the form of a count of the amount of times it has repeated.

-> org.expeditee.gui.MessageBay


Updated comment on the Vector class to explain what vectors are.

-> org.expeditee.gui.Vector


Added constants to represent all of the property keys in DefaultFrameReader and DefaultFrameWriter.

-> org.expeditee.io.DefaultFrameReader
-> org.expeditee.io.DefaultFrameWriter


Updated the KeyList setting to be more heirarcial with how users will store their Secrets.

-> org.expeditee.settings.identity.secrets.KeyList

File size: 56.9 KB
Line 
1/**
2 * WebParser.java
3 * Copyright (C) 2010 New Zealand Digital Library, http://expeditee.org
4 *
5 * This program is free software: you can redistribute it and/or modify
6 * it under the terms of the GNU General Public License as published by
7 * the Free Software Foundation, either version 3 of the License, or
8 * (at your option) any later version.
9 *
10 * This program is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 * GNU General Public License for more details.
14 *
15 * You should have received a copy of the GNU General Public License
16 * along with this program. If not, see <http://www.gnu.org/licenses/>.
17 */
18
19package org.expeditee.io;
20
21import java.io.File;
22import java.io.IOException;
23import java.lang.reflect.InvocationTargetException;
24import java.net.HttpURLConnection;
25import java.net.MalformedURLException;
26import java.net.URL;
27import java.text.SimpleDateFormat;
28/*
29 * JavaFX is not on the default java classpath until Java 8 (but is still included with Java 7), so your IDE will probably complain that the imports below can't be resolved.
30 * In Eclipse hitting'Proceed' when told 'Errors exist in project' should allow you to run Expeditee without any issues (although the JFX Browser widget will not display),
31 * or you can just exclude JfxBrowser, WebParser and JfxbrowserActions from the build path.
32 *
33 * If you are using Ant to build/run, 'ant build' will try to build with JavaFX jar added to the classpath.
34 * If this fails, 'ant build-nojfx' will build with the JfxBrowser, WebParser and JfxbrowserActions excluded from the build path.
35 */
36import java.util.Arrays;
37import java.util.Date;
38
39import javafx.animation.AnimationTimer;
40import javafx.application.Platform;
41import javafx.beans.value.ChangeListener;
42import javafx.beans.value.ObservableValue;
43import javafx.concurrent.Worker.State;
44import javafx.scene.SnapshotParameters;
45import javafx.scene.image.WritableImage;
46import javafx.scene.web.WebEngine;
47import javafx.scene.web.WebView;
48
49import netscape.javascript.JSObject;
50
51import org.expeditee.core.Colour;
52import org.expeditee.core.Font;
53import org.expeditee.core.Image;
54import org.expeditee.core.InOutReference;
55import org.expeditee.gio.EcosystemManager;
56import org.expeditee.gio.gesture.StandardGestureActions;
57import org.expeditee.gio.swing.SwingMiscManager;
58import org.expeditee.gui.DisplayController;
59import org.expeditee.gui.Frame;
60import org.expeditee.gui.FrameIO;
61import org.expeditee.gui.FrameUtils;
62import org.expeditee.gui.MessageBay;
63import org.expeditee.gui.MessageBay.Progress;
64import org.expeditee.items.ItemUtils;
65import org.expeditee.items.Justification;
66import org.expeditee.items.Picture;
67import org.expeditee.items.Text;
68import org.expeditee.items.widgets.JfxBrowser;
69import org.w3c.dom.Node;
70
71/**
72 * Methods to convert webpages to Expeditee frames
73 *
74 * @author ngw8
75 * @author jts21
76 */
77public class WebParser {
78
79
80 /**
81 * Loads a webpage and renders it as Expeditee frame(s)
82 *
83 * @param URL
84 * Page to load
85 * @param frame
86 * The Expeditee frame to output the converted page to
87 */
88 public static void parseURL(final String URL, final Frame frame) {
89 try {
90 Platform.runLater(new Runnable() {
91 @Override
92 public void run() {
93 try {
94 WebEngine webEngine = new WebEngine(URL);
95 loadPage(webEngine, frame);
96 } catch (Exception e) {
97 e.printStackTrace();
98 }
99 }
100 });
101 } catch (Exception e) {
102 e.printStackTrace();
103 }
104 }
105
106 protected static void loadPage(final WebEngine webEngine, final Frame frame) throws Exception {
107 webEngine.getLoadWorker().stateProperty().addListener(new ChangeListener<State>() {
108
109 @Override
110 public void changed(ObservableValue<? extends State> ov, State oldState, State newState) {
111
112 switch (newState) {
113 case READY: // READY
114 // MessageBay.displayMessage("WebEngine ready");
115 break;
116 case SCHEDULED: // SCHEDULED
117 // MessageBay.displayMessage("Scheduled page load");
118 break;
119 case RUNNING: // RUNNING
120 System.out.println("Loading page!");
121 // MessageBay.displayMessage("WebEngine running");
122 break;
123 case SUCCEEDED: // SUCCEEDED
124 // MessageBay.displayMessage("Finished loading page");
125 System.out.println("Parsing page!");
126 webEngine.executeScript("window.resizeTo(800, 800);"
127 + "document.body.style.width = '1000px'");
128 parsePage(webEngine, frame);
129 System.out.println("Parsed page!");
130 break;
131 case CANCELLED: // CANCELLED
132 MessageBay.displayMessage("Cancelled loading page");
133 break;
134 case FAILED: // FAILED
135 MessageBay.displayMessage("Failed to load page");
136 break;
137 }
138 }
139 });
140 }
141
142 /**
143 * Converts a loaded page to Expeditee frame(s)
144 *
145 * @param webEngine
146 * The JavaFX WebEngine in which the page to be converted is loaded
147 * @param frame
148 * The Expeditee frame to output the converted page to
149 */
150 public static void parsePage(final WebEngine webEngine, final Frame frame) {
151 try {
152 Platform.runLater(new Runnable() {
153 @Override
154 public void run() {
155 try {
156 Progress progressBar = MessageBay.displayProgress("Converting web page");
157
158 Node doc = (Node) webEngine.executeScript("document.body");
159
160 JSObject window = (JSObject) webEngine.executeScript("window");
161
162 frame.setBackgroundColor(rgbStringToColor((String) ((JSObject) (window.call("getComputedStyle", new Object[] { doc }))).call("getPropertyValue",
163 new Object[] { "background-color" })));
164
165 // Functions to be used later in JavaScript
166 webEngine.executeScript(""
167 + "function addToSpan(text) {"
168 + " span = document.createElement('wordSpan');"
169 + " span.textContent = text;"
170 + " par.insertBefore(span, refNode);"
171 // Checking if the current word is on a new line (i.e. lower than the previous word)
172 + " if (prevSpan !== null && span.getBoundingClientRect().top > prevSpan.getBoundingClientRect().top) {"
173 // If it is, prepend a new line character to it. The new line characters doesn't affect the rendered HTML
174 + " span.textContent = '\\n' + span.textContent;"
175
176 // Checking if the previous word is horizontally aligned with the one before it.
177 // If it is, merge the text of the two spans
178 + " if ( prevPrevSpan !== null && prevPrevSpan.getBoundingClientRect().left == prevSpan.getBoundingClientRect().left) {"
179 + " prevPrevSpan.textContent = prevPrevSpan.textContent + prevSpan.textContent;"
180 + " par.removeChild(prevSpan);"
181 + " } else {"
182 + " prevPrevSpan = prevSpan;"
183 + " }"
184 + " prevSpan = span;"
185 + " } else if ( prevSpan !== null) {"
186 // Word is on the same line as the previous one, so merge the second into the span of the first
187 + " prevSpan.textContent = prevSpan.textContent + span.textContent;"
188 + " par.removeChild(span);"
189 + " } else {"
190 + " prevSpan = span;"
191 + " }"
192 + "}"
193
194 + "function splitIntoWords(toSplit) {"
195 + " var words = [];"
196 + " var pattern = /\\s+/g;"
197 + " var words = toSplit.split(pattern);"
198 + ""
199 + " for (var i = 0; i < words.length - 1; i++) {"
200 + " words[i] = words[i] + ' ';"
201 + " }"
202 + " return words;"
203 + "}"
204 );
205
206 // Using Javascript to get an array of all the text nodes in the document so they can be wrapped in spans. Have to
207 // loop through twice (once to build the array and once actually going through the array, otherwise when the
208 // textnode is removed from the document items end up being skipped)
209 JSObject textNodes = (JSObject) webEngine.executeScript(""
210 + "function getTextNodes(rootNode){"
211 + "var node;"
212 + "var textNodes=[];"
213 + "var walk = document.createTreeWalker(rootNode, NodeFilter.SHOW_TEXT);"
214 + "while(node=walk.nextNode()) {"
215 + "if((node.textContent.trim().length > 0)) { "
216 + "textNodes.push(node);"
217 + "}"
218 + "}"
219 + "return textNodes;"
220 + "}; "
221 + "getTextNodes(document.body)"
222 );
223
224 int nodesLength = (Integer) textNodes.getMember("length");
225
226 // Looping through all the text nodes in the document
227 for (int j = 0; j < nodesLength; j++) {
228 Node currentNode = (Node) textNodes.getSlot(j);
229
230 // Making the current node accessible in JavaScript
231 window.setMember("currentNode", currentNode);
232
233 webEngine.executeScript(""
234 + "var span = null, prevSpan = null, prevPrevSpan = null;"
235
236 // Removing repeated whitespace from the text node's content then splitting it into individual words
237 + "var textContent = currentNode.textContent.replace(/\\n|\\r/g, '').replace(/\\s+/g, ' ');"
238 + "var words = splitIntoWords(textContent);"
239
240 + "var refNode = currentNode.nextSibling;"
241 + "var par = currentNode.parentElement;"
242 + "currentNode.parentElement.removeChild(currentNode);"
243
244 + "for (var i = 0; i < words.length; i++) {"
245 + " addToSpan(words[i]);"
246 + "}"
247
248 + "if (prevPrevSpan !== null && prevPrevSpan.getBoundingClientRect().left == prevSpan.getBoundingClientRect().left) {"
249 + " prevPrevSpan.textContent = prevPrevSpan.textContent + prevSpan.textContent;"
250 + " par.removeChild(prevSpan);"
251 + "}"
252 );
253
254 // Will never reach 100% here, as the processing is not quite finished - progress is set to 100% at the end of
255 // the addPageToFrame loop below
256 progressBar.set((100 * (j)) / nodesLength);
257 }
258
259 // Finding all links within the page, then setting the href attribute of all their descendants to be the same
260 // link/URL.
261 // This is needed because there is no apparent and efficient way to check if an element is a child of a link when
262 // running through the document when added each element to Expeditee
263 webEngine.executeScript(""
264 + "var anchors = document.getElementsByTagName('a');"
265 + ""
266 + "for (var i = 0; i < anchors.length; i++) {"
267 + "var currentAnchor = anchors.item(i);"
268 + "var anchorDescendants = currentAnchor.querySelectorAll('*');"
269 + "for (var j = 0; j < anchorDescendants.length; j++) {"
270 + "anchorDescendants.item(j).href = currentAnchor.href;"
271 + "}"
272 + "}"
273 );
274
275 WebParser.addPageToFrame(doc, window, webEngine, frame);
276
277 progressBar.set(100);
278
279 } catch (Exception e) {
280 e.printStackTrace();
281 }
282 System.out.println("Parsed frame");
283 FrameUtils.Parse(frame);
284 frame.setChanged(true);
285 FrameIO.SaveFrame(frame);
286 }
287 });
288 } catch (Exception e) {
289 e.printStackTrace();
290 }
291 }
292
293 /**
294 * Converts a loaded page to Expeditee frame(s)
295 *
296 * @param webEngine
297 * The JavaFX WebEngine in which the page to be converted is loaded
298 * @param frame
299 * The Expeditee frame to output the converted page to
300 */
301 public static void parsePageSimple(final JfxBrowser browserWidget, final WebEngine webEngine, final WebView webView, final Frame frame) {
302 try {
303
304 final int verticalScrollPerPage = (int) (DisplayController.getFramePaintAreaHeight() * 0.85);
305 final int horizontalScrollPerPage = (int) (DisplayController.getFramePaintAreaWidth() * 0.85);
306
307 Platform.runLater(new Runnable() {
308
309 @Override
310 public void run() {
311 browserWidget.setOverlayVisible(true);
312
313 // Webview area is set to slightly larger than the size of a converted page, to give some overlap between each page
314 browserWidget.setWebViewSize(horizontalScrollPerPage * 1.1, verticalScrollPerPage * 1.1);
315 browserWidget.setScrollbarsVisible(false);
316 }
317 });
318
319 final Object notifier = new Object();
320
321 final InOutReference<Integer> verticalCount = new InOutReference<Integer>(0);
322 final InOutReference<Integer> horizontalCount = new InOutReference<Integer>(0);
323
324 final InOutReference<Integer> pagesVertical = new InOutReference<Integer>(1);
325 final InOutReference<Integer> pagesHorizontal = new InOutReference<Integer>(1);
326
327 final String pageTitle;
328
329 if (webEngine.getTitle() != null) {
330 pageTitle = webEngine.getTitle();
331 } else {
332 pageTitle = "Untitled Page";
333 }
334
335 final Progress progressBar = MessageBay.displayProgress("Converting web page");
336
337 final Frame frameset = FrameIO.CreateNewFrameset(FrameIO.ConvertToValidFramesetName((new SimpleDateFormat("yy-MM-dd-HH-mm-ss").format(new Date())) + pageTitle));
338
339 frameset.setTitle(pageTitle);
340 frameset.getTitleItem().setSize(14);
341
342 WebParser.addButton("Return to original frame", frame.getName(), null, 200, frameset, null, 0, 10, null);
343
344 Text link = DisplayController.getCurrentFrame().addText(DisplayController.getMouseX(), DisplayController.getMouseY(), pageTitle, null);
345 link.setLink(frameset.getName());
346
347 StandardGestureActions.pickup(link);
348
349 // Timer that fires every time JFX is redrawn. After a few redraws, the handle method of this takes a screenshot of the page,
350 // adds it to the frame, then adds the text on top
351 AnimationTimer timer = new AnimationTimer() {
352
353 int frameCount = 0;
354
355 Frame frameToAddTo = frameset;
356 int thumbWidth = 100;
357
358 @Override
359 public void handle(long arg0) {
360 // Must wait 2 frames before taking a snapshot of the webview, otherwise JavaFX won't have redrawn
361 if (frameCount++ > 1) {
362 frameCount = 0;
363 this.stop();
364
365 verticalCount.set(verticalCount.get() + 1);
366
367 frameToAddTo = FrameIO.CreateFrame(frameToAddTo.getFramesetName(), pageTitle, null);
368 frameToAddTo.removeAllItems(frameToAddTo.getSortedItems());
369
370 try {
371 // removing the CSS that hides the text (otherwise the text would not pass the visibility check that is run on
372 // it before adding it to the frame)
373 webEngine.executeScript("cssHide.innerHTML = '';");
374
375 JSObject window = (JSObject) webEngine.executeScript("window");
376
377 int visibleWidth = (Integer) webEngine.executeScript("window.innerWidth");
378 int visibleHeight = (Integer) webEngine.executeScript("window.innerHeight");
379
380 WebParser.addTextToFrame(visibleWidth, visibleHeight, window, webEngine, frameToAddTo);
381
382 FrameIO.SaveFrame(frameToAddTo);
383 } catch (Exception ex) {
384 ex.printStackTrace();
385 }
386
387 webEngine.executeScript(""
388 // Setting all text to be hidden before taking the screenshot
389 + "cssHide.appendChild(document.createTextNode(wordSpanHiddenStyle));");
390
391 WritableImage img = new WritableImage((int)webView.getWidth(), (int)webView.getHeight());
392
393 webView.snapshot(new SnapshotParameters(), img);
394
395 // Getting a BufferedImage from the JavaFX image
396 //BufferedImage image = SwingFXUtils.fromFXImage(img, null);
397 Image image = SwingMiscManager.getImageForJavaFXImage(img);
398
399 try {
400 // TODO: tidy. cts16
401 //int hashcode = Arrays.hashCode(image.getData().getPixels(0, 0, image.getWidth(), image.getHeight(), (int[]) null));
402 int hashcode = image.hashCode();
403
404 File out = new File(FrameIO.IMAGES_PATH + "webpage-" + Integer.toHexString(hashcode) + ".png");
405 out.mkdirs();
406 image.writeToDisk("png", out);
407
408 // Adding the image to the frame
409 frameToAddTo.addText(0, 0, "@i: " + out.getName(), null);
410
411 // Adding thumbnail to the overview page
412 Text thumb = frameset.addText((int) (thumbWidth * 1.1 * horizontalCount.get()) + 10,
413 (int) ((((float) thumbWidth / image.getWidth()) * image.getHeight()) * 1.1 * verticalCount.get()),
414 "@i: " + out.getName() + " " + thumbWidth,
415 null);
416
417 thumb.setLink(frameToAddTo.getName());
418 thumb.setBorderColor(Colour.LIGHT_GREY);
419 thumb.setThickness(1);
420
421 // Button to go to the next frame/page
422 WebParser.addButton("Next", null, "next", 70, frameToAddTo, null, 0, 10, null);
423
424 // Button to go to the previous frame/page
425 if (verticalCount.get() > 1 || horizontalCount.get() > 0) {
426 WebParser.addButton("Previous", null, "previous", 70, frameToAddTo, null, 85, 10, null);
427 }
428
429 // Button to return to the index/overview page
430 WebParser.addButton("Index", frameset.getName(), null, 70, frameToAddTo, null, null, 10, 5);
431
432 FrameIO.SaveFrame(frameToAddTo);
433 FrameIO.SaveFrame(frameset);
434
435 } catch (IOException e) {
436 e.printStackTrace();
437 }
438
439 image.releaseImage();
440
441 synchronized (notifier) {
442 // Notifying that the timer has finished
443 notifier.notify();
444 }
445 }
446 }
447 };
448
449 Platform.runLater(new Runnable() {
450 @Override
451 public void run() {
452 try {
453 JSObject window = (JSObject) webEngine.executeScript("window");
454
455 webEngine.executeScript(""
456 // Initializing the counter used when scrolling the page
457 + "var scrollCounter = 0;"
458 + "var scrollCounterHorizontal = 0;"
459
460 // Storing the current scroll position
461 + "var originalScrollX = window.pageXOffset;"
462 + "var originalScrollY = window.pageYOffset;");
463
464 window.setMember("horizontalScrollPerPage", horizontalScrollPerPage);
465 window.setMember("verticalScrollPerPage", verticalScrollPerPage);
466
467
468
469 // The scrollPerPage will always be less than the page's height, due to the overlap being added/allowed for between pages,
470 // but if the webpage fits in a single converted page, there's no need for any overlap, so just use 1 as the number of pages
471 if((Boolean) webEngine.executeScript("document.documentElement.scrollHeight > window.innerHeight")) {
472 pagesVertical.set((int) Math.ceil((Integer) webEngine.executeScript("document.documentElement.scrollHeight") / (float) verticalScrollPerPage));
473 }
474
475 if((Boolean) webEngine.executeScript("document.documentElement.scrollWidth > window.innerWidth")) {
476 pagesHorizontal.set((int) Math.ceil((Integer) webEngine.executeScript("document.documentElement.scrollWidth") / (float) horizontalScrollPerPage));
477 }
478
479 System.out.println(webEngine.executeScript("document.documentElement.scrollWidth") + "/" + horizontalScrollPerPage);
480 System.out.println(pagesVertical.get() + "x" + pagesHorizontal.get());
481
482 // Setting up the element that contains the CSS to hide all text. Also hiding readability mode buttons.
483 // This is wiped before the text is converted, then re-added before taking the screenshot
484 webEngine.executeScript(""
485 + "var cssHide = document.createElement('style');"
486 + "cssHide.type = 'text/css';"
487 + "var wordSpanHiddenStyle = 'WordSpan, #readOverlay #readTools { visibility: hidden !important;}';"
488 + "cssHide.appendChild(document.createTextNode(wordSpanHiddenStyle));"
489 + "document.getElementsByTagName('head')[0].appendChild(cssHide);"
490 );
491
492 // Replacing line breaks in all <pre> tags with <br> tags, otherwise they are lost during the conversion
493 webEngine.executeScript(""
494 + "var pres = document.getElementsByTagName ('pre');"
495 + "for(var i = 0; i < pres.length; i++){"
496 + " pres[i].innerHTML = pres[i].innerHTML.replace(/\\n|\\r/g, '<br />');"
497 + "}");
498
499 // Functions to be used later in JavaScript
500 webEngine.executeScript(""
501 + "function addToSpan(text) {"
502 + " span = document.createElement('wordSpan');"
503 + " span.textContent = text;"
504 + " par.insertBefore(span, refNode);"
505 + " if (prevSpan !== null && span.getBoundingClientRect().top > prevSpan.getBoundingClientRect().top) {"
506 + " span.textContent = '\\n' + span.textContent;"
507 + " if ( prevPrevSpan !== null && prevPrevSpan.getBoundingClientRect().left == prevSpan.getBoundingClientRect().left) {"
508 + " prevPrevSpan.textContent = prevPrevSpan.textContent + prevSpan.textContent;"
509 + " par.removeChild(prevSpan);"
510 + " } else {"
511 + " prevPrevSpan = prevSpan;"
512 + " }"
513 + " prevSpan = span;"
514 + " } else if ( prevSpan !== null) {"
515 + " prevSpan.textContent = prevSpan.textContent + span.textContent;"
516 + " par.removeChild(span);"
517 + " } else {"
518 + " prevSpan = span;"
519 + " }"
520 + "}"
521
522 + "function splitIntoWords(toSplit) {"
523 + " var words = [];"
524 + " var pattern = /\\s+/g;"
525 + " var words = toSplit.split(pattern);"
526 + ""
527 + " for (var i = 0; i < words.length - 1; i++) {"
528 + " words[i] = words[i] + ' ';"
529 + " }"
530 + " return words;"
531 + "}"
532 );
533
534 // Using Javascript to get an array of all the text nodes in the document so they can be wrapped in spans. Have to
535 // loop through twice (once here to build the array and once later actually going through the array, otherwise when the
536 // textnode is removed from the document items end up being skipped)
537 webEngine.executeScript(""
538 + "var node;"
539 + "var textNodes=[];"
540 + "var walk = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT);"
541 );
542
543 while(webEngine.executeScript("node=walk.nextNode()") != null && browserWidget.isParserRunning()) {
544
545 webEngine.executeScript(""
546 + "if((node.textContent.trim().length > 0)) { "
547 + "textNodes.push(node);"
548 + "}"
549 );
550 }
551
552 JSObject textNodes = (JSObject) webEngine.executeScript("textNodes");
553
554 int nodesLength = (Integer) textNodes.getMember("length");
555
556 // Looping through all the text nodes in the document
557 for (int j = 0; j < nodesLength && browserWidget.isParserRunning(); j++) {
558 Node currentNode = (Node) textNodes.getSlot(j);
559
560 // Making the current node accessible in JavaScript
561 window.setMember("currentNode", currentNode);
562
563 webEngine.executeScript(""
564 + "var span = null, prevSpan = null, prevPrevSpan = null;"
565
566 // Removing repeated whitespace from the text node's content then splitting it into individual words
567 + "var textContent = currentNode.textContent.replace(/\\n|\\r/g, '').replace(/\\s+/g, ' ');"
568 + "var words = splitIntoWords(textContent);"
569
570 + "var refNode = currentNode.nextSibling;"
571 + "var par = currentNode.parentElement;"
572 + "currentNode.parentElement.removeChild(currentNode);"
573
574 + "for (var i = 0; i < words.length; i++) {"
575 + " addToSpan(words[i]);"
576 + "}"
577
578 + "if (prevPrevSpan !== null && prevPrevSpan.getBoundingClientRect().left == prevSpan.getBoundingClientRect().left) {"
579 + " prevPrevSpan.textContent = prevPrevSpan.textContent + prevSpan.textContent;"
580 + " par.removeChild(prevSpan);"
581 + "}"
582 );
583
584 // Will never reach 100% here, as the processing is not quite finished - progress is set to 100% at the end of
585 // the addPageToFrame loop below
586 try {
587 progressBar.set((50 * (j + 1)) / nodesLength);
588 } catch (Exception e) {
589 // Seems to be a bug somewhere along the line when updating the progressbar, so am catching any exception
590 // thrown here to avoid it stuffing up the rest of the parsing
591 e.printStackTrace();
592 }
593 }
594
595 // Finding all links within the page, then setting the href attribute of all their descendants to be the same
596 // link/URL.
597 // This is needed because there is no apparent and efficient way to check if an element is a child of a link when
598 // running through the document when added each element to Expeditee
599 webEngine.executeScript(""
600 + "var anchors = document.getElementsByTagName('a');"
601 + ""
602 + "for (var i = 0; i < anchors.length; i++) {"
603 + "var currentAnchor = anchors.item(i);"
604 + "var anchorDescendants = currentAnchor.querySelectorAll('*');"
605 + "for (var j = 0; j < anchorDescendants.length; j++) {"
606 + "anchorDescendants.item(j).href = currentAnchor.href;"
607 + "}"
608 + "}"
609 );
610
611 } catch (Exception ex) {
612 ex.printStackTrace();
613 }
614
615 synchronized (notifier) {
616 notifier.notify();
617 }
618 }
619 });
620
621 synchronized (notifier) {
622 try {
623 // Waiting for the page setup (splitting into spans) to finish
624 notifier.wait();
625 } catch (InterruptedException e) {
626 // TODO Auto-generated catch block
627 e.printStackTrace();
628 }
629 }
630
631 // Loop that scrolls the page horizontally
632 for(int i = 0; i < pagesHorizontal.get() && browserWidget.isParserRunning(); i++) {
633
634 Platform.runLater(new Runnable() {
635 @Override
636 public void run() {
637 try {
638 // Scrolling down the page
639 webEngine.executeScript(""
640 + "scrollCounter = 0;"
641 + "window.scrollTo(scrollCounterHorizontal * horizontalScrollPerPage, 0);"
642 + "scrollCounterHorizontal = scrollCounterHorizontal+1;");
643
644 } catch (Exception e) {
645 e.printStackTrace();
646 }
647 }
648 });
649
650 // Loop that scrolls the page vertically (for each horizontal scroll position)
651 for(int j = 0; j < pagesVertical.get() && browserWidget.isParserRunning(); j++) {
652
653 try {
654 progressBar.set((int) (50 + ((float)(j+1)/(pagesVertical.get() * pagesHorizontal.get()) + ((float)(i) / pagesHorizontal.get())) * 50));
655 } catch (Exception e) {
656 e.printStackTrace();
657 }
658
659 Platform.runLater(new Runnable() {
660 @Override
661 public void run() {
662 try {
663 // Scrolling down the page
664 webEngine.executeScript(""
665 + "window.scrollTo(window.pageXOffset, scrollCounter * verticalScrollPerPage);"
666 + "scrollCounter = scrollCounter+1;");
667
668 synchronized (notifier) {
669 notifier.notify();
670 }
671
672 } catch (Exception e) {
673 e.printStackTrace();
674 }
675 }
676 });
677
678 synchronized (notifier) {
679 try {
680 // Waiting for the page to be scrolled
681 notifier.wait();
682 } catch (InterruptedException e) {
683 // TODO Auto-generated catch block
684 e.printStackTrace();
685 }
686 }
687
688 timer.start();
689
690 synchronized (notifier) {
691 try {
692 // Waiting for the timer thread to finish before looping again
693 notifier.wait();
694 } catch (InterruptedException e) {
695 // TODO Auto-generated catch block
696 e.printStackTrace();
697 }
698 }
699
700 }
701
702 horizontalCount.set(horizontalCount.get() + 1);
703 verticalCount.set(0);
704 }
705
706 if(browserWidget.isParserRunning()) {
707 progressBar.set(100);
708 } else {
709 MessageBay.displayMessage("Web page conversion cancelled");
710 }
711
712 browserWidget.parserFinished();
713
714 Platform.runLater(new Runnable() {
715 @Override
716 public void run() {
717 // Scrolling to the original position on the page
718 webEngine.executeScript("window.scrollTo(originalScrollX, originalScrollY)");
719 // Reloading the page once the parsing is done - only realistic way to reset (i.e. remove all the added WordSpan tags)
720 // the page
721 webEngine.reload();
722 }
723 });
724
725 } catch (Exception ex) {
726 ex.printStackTrace();
727 }
728
729 Platform.runLater(new Runnable() {
730
731 @Override
732 public void run() {
733 browserWidget.setOverlayVisible(false);
734 browserWidget.rebindWebViewSize();
735 browserWidget.setScrollbarsVisible(true);
736 }
737 });
738
739 }
740
741 /**
742 * @param rgbString
743 * string in the format <i>rgb(x,x,x)</i> or <i>rgba(x,x,x,x)</i>
744 * @return A Color object that should match the rgb string passed int. Returns null if alpha is 0
745 */
746 private static Colour rgbStringToColor(String rgbString) {
747
748 if (rgbString == null) {
749 return null;
750 }
751
752 // Splitting the string into 'rgb' and 'x, x, x'
753 String[] tmpStrings = rgbString.split("\\(|\\)");
754
755 // Splitting up the RGB(A) components into an array
756 tmpStrings = tmpStrings[1].split(",");
757
758 int[] components = new int[4];
759 Arrays.fill(components, 255);
760
761 for (int i = 0; i < tmpStrings.length; i++) {
762 Float d = Float.parseFloat(tmpStrings[i].trim());
763
764 components[i] = Math.round(d);
765 }
766
767 if (components[3] > 0) {
768 return Colour.FromRGBA255(components[0], components[1], components[2], components[3]);
769 } else {
770 return null;
771 }
772 }
773
774 /**
775 * @param rootElement
776 * Element that will be converted (including all sub-elements)
777 * @param backgroundColor
778 * String to be used as the background color of this element when added. In the format "rgb(x,x,x)" or "rgba(x,x,x,x)"
779 * @param window
780 * 'window' from Javascript
781 * @param webEngine
782 * Web engine that the page is loaded in
783 * @param frame
784 * Expeditee frame to add the converted page to
785 * @throws IllegalArgumentException
786 * @throws IllegalAccessException
787 */
788 private static void addPageToFrame(Node rootElement, JSObject window, WebEngine webEngine, Frame frame) throws InvocationTargetException, IllegalAccessException,
789 IllegalArgumentException {
790
791 Node currentNode = rootElement;
792
793 if (currentNode.getNodeType() == Node.TEXT_NODE || currentNode.getNodeType() == Node.ELEMENT_NODE) {
794
795 JSObject style;
796 JSObject bounds;
797
798 if (currentNode.getNodeType() == Node.TEXT_NODE) {
799 // CSS style for the element
800 style = (JSObject) window.call("getComputedStyle", new Object[] { currentNode.getParentNode() });
801
802 // Getting a rectangle that represents the area and position of the element
803 bounds = (JSObject) ((JSObject) currentNode.getParentNode()).call("getBoundingClientRect", new Object[] {});
804 } else {
805 style = (JSObject) window.call("getComputedStyle", new Object[] { currentNode });
806
807 bounds = (JSObject) ((JSObject) currentNode).call("getBoundingClientRect", new Object[] {});
808 }
809
810 // Bounding rectangle position is relative to the current view, so scroll position must be added to x/y
811 // TODO: This doesn't check if an element or any of its parent elements have position:fixed set - the only
812 // way to check seems to be to walking through the element's parents until the document root is reached
813 float x = Float.valueOf(bounds.getMember("left").toString()) + Float.valueOf(webEngine.executeScript("window.pageXOffset").toString());
814 float y = Float.valueOf(bounds.getMember("top").toString()) + Float.valueOf(webEngine.executeScript("window.pageYOffset").toString());
815
816 float width = Float.valueOf(bounds.getMember("width").toString());
817 float height = Float.valueOf(bounds.getMember("height").toString());
818
819 // Checking if the element is actually visible on the page
820 if (WebParser.elementVisible(x, y, width, height, style)) {
821
822 // Filtering the node type, starting with text nodes
823 if (currentNode.getNodeType() == Node.TEXT_NODE) {
824
825 String fontSize = ((String) style.call("getPropertyValue", new Object[] { "font-size" }));
826
827 // Trimming off the units (always px) from the font size
828 fontSize = fontSize.substring(0, fontSize.length() - 2);
829
830 // Always returns in format "rgb(x,x,x)" or "rgba(x,x,x,x)"
831 String color = (String) style.call("getPropertyValue", new Object[] { "color" });
832
833 // Always returns in format "rgb(x,x,x)" or "rgba(x,x,x,x)"
834 String bgColorString = (String) style.call("getPropertyValue", new Object[] { "background-color" });
835
836 String align = (String) style.call("getPropertyValue", new Object[] { "text-align" });
837
838 // Returns comma-separated list of typefaces
839 String typeface = (String) style.call("getPropertyValue", new Object[] { "font-family" });
840
841 String[] typefaces = typeface.split(", |,");
842
843 String weight = (String) style.call("getPropertyValue", new Object[] { "font-weight" });
844
845 String fontStyle = (String) style.call("getPropertyValue", new Object[] { "font-style" });
846
847 // Returns "normal" or a value in pixels (e.g. "10px")
848 String letterSpacing = (String) style.call("getPropertyValue", new Object[] { "letter-spacing" });
849
850 // Returns a value in pixels (e.g. "10px")
851 String lineHeight = (String) style.call("getPropertyValue", new Object[] { "line-height" });
852
853 String textTransform = (String) style.call("getPropertyValue", new Object[] { "text-transform" });
854
855 String linkUrl = (String) ((JSObject) currentNode.getParentNode()).getMember("href");
856
857 Boolean fontFound = false;
858 Font font = null;
859
860 // Looping through all font-families listed in the element's CSS until one that is installed is
861 // found, or the end of the list is reached, in which case the default font is used
862 for (int j = 0; j < typefaces.length && !fontFound; j++) {
863 if (typefaces[j].toLowerCase().equals("sans-serif")) {
864 typefaces[j] = "Arial Unicode MS";
865 } else if (typefaces[j].toLowerCase().equals("serif")) {
866 typefaces[j] = "Times New Roman";
867 } else if ((typefaces[j].toLowerCase().equals("arial"))) {
868 // Have to use Arial Unicode, otherwise unicode characters display incorrectly
869 typefaces[j] = "Arial Unicode MS";
870 }
871
872 // Regex will remove any inverted commas surrounding multi-word typeface names
873 String familyName = typefaces[j].replaceAll("^'|'$", "");
874 font = new Font(familyName);
875 font.setStyle(Font.Style.PLAIN);
876 font.setSize(12);
877
878 // Check whether the font was found
879 if (EcosystemManager.getFontManager().getActualFont(font).getFamilyName().toLowerCase().equals(familyName.toLowerCase())) {
880 fontFound = true;
881 }
882 }
883
884 if (!fontFound) {
885 font = new Font("Times New Roman");
886 font.setStyle(Font.Style.PLAIN);
887 font.setSize(12);
888 }
889
890 String fontStyleComplete = "";
891
892 int weightInt = 0;
893
894 try {
895 weightInt = Integer.parseInt(weight);
896 } catch (NumberFormatException nfe) {
897 // Use default value as set above
898 }
899
900 // checking if font is bold - i.e. 'bold', 'bolder' or weight over 500
901 if (weight.toLowerCase().startsWith("bold") || weightInt > 500) {
902 fontStyleComplete = fontStyleComplete.concat("bold");
903 }
904
905 if (fontStyle.toLowerCase().equals("italic") || fontStyle.toLowerCase().equals("oblique")) {
906 fontStyleComplete = fontStyleComplete.concat("italic");
907 }
908
909 float fontSizeFloat = 12;
910
911 try {
912 fontSizeFloat = Float.valueOf(fontSize);
913 } catch (NumberFormatException nfe) {
914 // Use default value as set above
915 }
916
917 float letterSpacingFloat = -0.008f;
918
919 try {
920 letterSpacingFloat = (Integer.parseInt(letterSpacing.substring(0, letterSpacing.length() - 2)) / (fontSizeFloat));
921 } catch (NumberFormatException nfe) {
922 // Use default value as set above
923 }
924
925 float lineHeightInt = -1;
926
927 try {
928 lineHeightInt = (Float.parseFloat(lineHeight.substring(0, lineHeight.length() - 2)));
929 } catch (NumberFormatException nfe) {
930 // Use default value as set above
931 }
932
933 Text t;
934
935 String textContent = currentNode.getTextContent().replaceAll("[^\\S\\n]+", " ");
936 textContent = textContent.replaceAll("^(\\s)(\\n|\\r)", "");
937
938 if (textTransform.equals("uppercase")) {
939 textContent = textContent.toUpperCase();
940 } else if (textTransform.equals("lowercase")) {
941 textContent = textContent.toUpperCase();
942 }
943
944 // Adding the text to the frame. Expeditee text seems to be positioned relative to the baseline of the first line, so
945 // the font size has to be added to the y-position
946 t = frame.addText(Math.round(x), Math.round(y + fontSizeFloat), textContent, null);
947
948 t.setColor(rgbStringToColor(color));
949 t.setBackgroundColor(rgbStringToColor(bgColorString));
950 t.setFont(font.clone());
951 t.setSize(fontSizeFloat);
952 t.setFontStyle(fontStyleComplete);
953 t.setLetterSpacing(letterSpacingFloat);
954
955 // Removing any spacing between lines allowing t.getLineHeight() to be used to get the actual height
956 // of just the characters (i.e. distance from ascenders to descenders)
957 t.setSpacing(0);
958
959 t.setSpacing(lineHeightInt - t.getLineHeight());
960
961 if (align.equals("left")) {
962 t.setJustification(Justification.left);
963 } else if (align.equals("right")) {
964 t.setJustification(Justification.right);
965 } else if (align.equals("center")) {
966 t.setJustification(Justification.center);
967 } else if (align.equals("justify")) {
968 t.setJustification(Justification.full);
969 }
970
971 // Font size is added to the item width to give a little breathing room
972 t.setWidth(Math.round(width + (t.getSize())));
973
974 if (!linkUrl.equals("undefined")) {
975 t.setAction("gotourl " + linkUrl);
976 t.setActionMark(false);
977 }
978
979 } else if (currentNode.getNodeType() == Node.ELEMENT_NODE) {
980
981 // Always returns in format "rgb(x,x,x)" or "rgba(x,x,x,x)"
982 String bgColorString = (String) style.call("getPropertyValue", new Object[] { "background-color" });
983
984 Colour bgColor = rgbStringToColor(bgColorString);
985
986 // If the element has a background color then add it (to Expeditee) as a rectangle with that background color
987 if (bgColor != null) {
988 frame.addRectangle(Math.round(x), Math.round(y), Math.round(width), Math.round(height), 0, null, bgColor);
989 }
990
991 String linkUrl = (String) ((JSObject) currentNode).getMember("href");
992
993 // background image, returns in format "url(protocol://absolute/path/to/img.extension)" for images,
994 // may also return gradients, data, etc. (not handled yet). Only need to add bg image on
995 // 'ELEMENT_NODE' (and not 'TEXT_NODE' otherwise there would be double-ups
996 if (((String) style.call("getPropertyValue", new Object[] { "background-image" })).startsWith("url(")) {
997
998 try {
999 WebParser.addBackgroundImageFromNode(currentNode, style, frame, linkUrl, x, y, width, height);
1000 } catch (MalformedURLException mue) {
1001 // probably a 'data:' url, not supported yet
1002 mue.printStackTrace();
1003 } catch (IOException e) {
1004 // TODO Auto-generated catch block
1005 e.printStackTrace();
1006 }
1007 }
1008
1009 String imgSrc;
1010
1011 if (currentNode.getNodeName().toLowerCase().equals("img") && (imgSrc = ((JSObject) currentNode).getMember("src").toString()) != null) {
1012 try {
1013 WebParser.addImageFromUrl(imgSrc, linkUrl, frame, x, y, (int) width, null, null, null, null, null, 0, 0);
1014 } catch (MalformedURLException mue) {
1015 // probably a 'data:' url, not supported yet
1016 mue.printStackTrace();
1017 } catch (IOException e) {
1018 // TODO Auto-generated catch block
1019 e.printStackTrace();
1020 }
1021 }
1022 }
1023 }
1024
1025 Node childNode = currentNode.getFirstChild();
1026
1027 while (childNode != null) {
1028 addPageToFrame(childNode, window, webEngine, frame);
1029 childNode = childNode.getNextSibling();
1030 }
1031 }
1032 }
1033
1034 private static boolean elementVisible(float x, float y, float width, float height, JSObject style) {
1035 if (width <= 0 || height <= 0 || x + width <= 0 || y + height <= 0 || ((String) style.call("getPropertyValue", new Object[] { "visibility" })).equals("hidden")
1036 || ((String) style.call("getPropertyValue", new Object[] { "display" })).equals("none")) {
1037 return false;
1038 } else {
1039 return true;
1040 }
1041 }
1042
1043 /**
1044 * @param imgSrc
1045 * URL of the image to add
1046 * @param linkUrl
1047 * Absolute URL that the image should link to when clicked
1048 * @param frame
1049 * Frame to add the image to
1050 * @param x
1051 * X-coordinate at which the image should be placed on the frame
1052 * @param y
1053 * Y-coordinate at which the image should be placed on the frame
1054 * @param width
1055 * Width of the image once added to the frame. Negative 1 (-1) will cause the actual width of the image file to be used
1056 *
1057 * @param cropStartX
1058 * X-coordinate at which to start crop, or null for no crop
1059 * @param cropStartY
1060 * Y-coordinate at which to start crop, or null for no crop
1061 * @param cropEndX
1062 * X-coordinate at which to end the crop, or null for no crop
1063 * @param cropEndY
1064 * Y-coordinate at which to end the crop, or null for no crop
1065 *
1066 * @param repeat
1067 * String determining how the image should be tiled/repeated. Valid strings are: <i>no-repeat</i>, <i>repeat-x</i>, or
1068 * <i>repeat-y</i>. All other values (including null) will cause the image to repeat in both directions
1069 *
1070 * @param originXPercent
1071 * Percentage into the image to use as the x coordinate of the image's origin point
1072 * @param originYPercent
1073 * Percentage into the image to use as the y coordinate of the image's origin point
1074 *
1075 * @throws MalformedURLException
1076 * @throws IOException
1077 */
1078 public static Picture getImageFromUrl(String imgSrc, String linkUrl, final Frame frame, float x, float y, int width,
1079 Integer cropStartX, Integer cropStartY, Integer cropEndX, Integer cropEndY, String repeat, float originXPercent, float originYPercent)
1080 throws IOException {
1081
1082 URL imgUrl = new URL(imgSrc);
1083
1084 HttpURLConnection connection = (HttpURLConnection) (imgUrl.openConnection());
1085
1086 Image img = Image.getImage(connection);
1087
1088 int hashcode = img.hashCode();
1089
1090 File out = new File(FrameIO.IMAGES_PATH + Integer.toHexString(hashcode) + ".png");
1091 out.mkdirs();
1092 img.writeToDisk("png", out);
1093
1094 if (repeat == null && cropEndX == null && cropStartX == null && cropEndY == null && cropStartY == null) {
1095 repeat = "no-repeat";
1096 }
1097
1098 if (cropEndX == null || cropStartX == null || cropEndY == null || cropStartY == null) {
1099 cropStartX = 0;
1100 cropStartY = 0;
1101 cropEndX = img.getWidth();
1102 cropEndY = img.getHeight();
1103 } else if (cropStartX < 0) {
1104 cropEndX = cropEndX - cropStartX;
1105 x = x + Math.abs(cropStartX);
1106 cropStartX = 0;
1107 }
1108
1109 if (cropStartY < 0) {
1110 cropEndY = cropEndY - cropStartY;
1111 y = y + Math.abs(cropStartY);
1112 cropStartY = 0;
1113 }
1114
1115 if (width < 0) {
1116 width = img.getWidth();
1117 }
1118
1119 if (repeat != null) {
1120 if (repeat.equals("no-repeat")) {
1121 int tmpCropEndY = (int) (cropStartY + ((float) width / img.getWidth()) * img.getHeight());
1122 int tmpCropEndX = cropStartX + width;
1123
1124 cropEndX = (cropEndX < tmpCropEndX) ? cropEndX : tmpCropEndX;
1125 cropEndY = (cropEndY < tmpCropEndY) ? cropEndY : tmpCropEndY;
1126 } else if (repeat.equals("repeat-x")) {
1127 int tmpCropEndY = (int) (cropStartY + ((float) width / img.getWidth()) * img.getHeight());
1128 cropEndY = (cropEndY < tmpCropEndY) ? cropEndY : tmpCropEndY;
1129 } else if (repeat.equals("repeat-y")) {
1130 int tmpCropEndX = cropStartX + width;
1131 cropEndX = (cropEndX < tmpCropEndX) ? cropEndX : tmpCropEndX;
1132 }
1133 }
1134
1135 if (originXPercent > 0) {
1136 int actualWidth = cropEndX - cropStartX;
1137
1138 int originXPixels = Math.round(originXPercent * actualWidth);
1139
1140 x = x - originXPixels;
1141
1142 cropStartX = (int) (cropStartX + (width - actualWidth) * originXPercent);
1143 cropEndX = (int) (cropEndX + (width - actualWidth) * originXPercent);
1144 }
1145
1146 if (originYPercent > 0) {
1147 int height = (int) ((img.getHeight() / (float) img.getWidth()) * width);
1148 int actualHeight = (cropEndY - cropStartY);
1149 int originYPixels = Math.round(originYPercent * actualHeight);
1150
1151 y = y - originYPixels;
1152
1153 cropStartY = (int) (cropStartY + (height - actualHeight) * originYPercent);
1154 cropEndY = (int) (cropEndY + (height - actualHeight) * originYPercent);
1155 }
1156
1157 Text text = new Text("@i: " + out.getName() + " " + width);
1158 text.setPosition(x, y);
1159
1160 Picture pic = ItemUtils.CreatePicture(text);
1161
1162 float invScale = 1 / pic.getScale();
1163
1164 pic.setCrop((int)(cropStartX * invScale), (int)(cropStartY * invScale), (int)(cropEndX * invScale), (int)(cropEndY * invScale));
1165
1166 if (linkUrl != null && !linkUrl.equals("undefined")) {
1167 pic.setAction("goto " + linkUrl);
1168 pic.setActionMark(false);
1169 }
1170
1171 return pic;
1172 }
1173
1174 private static void addImageFromUrl(String imgSrc, String linkUrl, final Frame frame, float x, float y, int width, Integer cropStartX, Integer cropStartY, Integer cropEndX, Integer cropEndY, String repeat,
1175 float originXPercent, float originYPercent)
1176 throws IOException {
1177 Picture pic = getImageFromUrl(imgSrc, linkUrl, frame, x, y, width, cropStartX, cropStartY, cropEndX, cropEndY, repeat, originXPercent, originYPercent);
1178 frame.addItem(pic);
1179 pic.anchor();
1180 pic.getSource().anchor();
1181 }
1182
1183 public static Picture getBackgroundImageFromNode(Node node, JSObject style, final Frame frame, String linkUrl, float x, float y, float width, float height) throws IOException {
1184
1185
1186 String bgImage = (String) style.call("getPropertyValue", new Object[] { "background-image" });
1187 bgImage = bgImage.substring(4, bgImage.length() - 1);
1188
1189 String bgSize = ((String) style.call("getPropertyValue", new Object[] { "background-size" })).toLowerCase();
1190 String bgRepeat = ((String) style.call("getPropertyValue", new Object[] { "background-repeat" })).toLowerCase();
1191
1192 // Returns "[x]px [y]px", "[x]% [y]%", "[x]px [y]%" or "[x]% [y]px"
1193 String bgPosition = ((String) style.call("getPropertyValue", new Object[] { "background-position" })).toLowerCase();
1194
1195 String[] bgOffsetCoords = bgPosition.split(" ");
1196
1197 int bgOffsetX = 0, bgOffsetY = 0;
1198
1199 float originXPercent = 0, originYPercent = 0;
1200
1201 int cropStartX, cropStartY, cropEndX, cropEndY;
1202
1203 // Converting the x and y offset values to integers (and from % to px if needed)
1204 if (bgOffsetCoords[0].endsWith("%")) {
1205 bgOffsetX = (int) ((Integer.valueOf(bgOffsetCoords[0].substring(0, bgOffsetCoords[0].length() - 1)) / 100.0) * width);
1206 originXPercent = (Integer.valueOf(bgOffsetCoords[0].substring(0, bgOffsetCoords[0].length() - 1))) / 100f;
1207 } else if (bgOffsetCoords[0].endsWith("px")) {
1208 bgOffsetX = (int) (Integer.valueOf(bgOffsetCoords[0].substring(0, bgOffsetCoords[0].length() - 2)));
1209 }
1210
1211 if (bgOffsetCoords[1].endsWith("%")) {
1212 bgOffsetY = (int) ((Integer.valueOf(bgOffsetCoords[1].substring(0, bgOffsetCoords[1].length() - 1)) / 100.0) * height);
1213 originYPercent = (Integer.valueOf(bgOffsetCoords[1].substring(0, bgOffsetCoords[1].length() - 1))) / 100f;
1214 } else if (bgOffsetCoords[1].endsWith("px")) {
1215 bgOffsetY = (int) (Integer.valueOf(bgOffsetCoords[1].substring(0, bgOffsetCoords[1].length() - 2)));
1216 }
1217
1218 // Converting from an offset to crop coords
1219 cropStartX = -1 * bgOffsetX;
1220 cropEndX = (int) (cropStartX + width);
1221
1222 cropStartY = -1 * bgOffsetY;
1223 cropEndY = (int) (cropStartY + height);
1224
1225 int bgWidth = -1;
1226
1227 if (bgSize.equals("cover")) {
1228 bgWidth = (int) width;
1229 } else if (bgSize.equals("contain")) {
1230 // TODO: actually compute the appropriate width
1231 bgWidth = (int) width;
1232 } else if (bgSize.equals("auto")) {
1233 bgWidth = -1;
1234 } else {
1235 bgSize = bgSize.split(" ")[0];
1236
1237 if (bgSize.endsWith("%")) {
1238 bgWidth = (int) ((Integer.parseInt(bgSize.replaceAll("\\D", "")) / 100.0) * width);
1239 } else if (bgSize.endsWith("px")) {
1240 bgWidth = Integer.parseInt(bgSize.replaceAll("\\D", ""));
1241 }
1242 }
1243
1244 return getImageFromUrl(bgImage, linkUrl, frame, x, y, bgWidth, cropStartX, cropStartY, cropEndX, cropEndY, bgRepeat, originXPercent, originYPercent);
1245 }
1246
1247 private static void addBackgroundImageFromNode(Node node, JSObject style, final Frame frame, String linkUrl, float x, float y, float width, float height) throws IOException {
1248 Picture pic = getBackgroundImageFromNode(node, style, frame, linkUrl, x, y, width, height);
1249 frame.addItem(pic);
1250 pic.anchor();
1251 pic.getSource().anchor();
1252 }
1253
1254 /**
1255 * @param rootElement
1256 * Element that will be converted (including all sub-elements)
1257 * @param backgroundColor
1258 * String to be used as the background color of this element when added. In the format "rgb(x,x,x)" or "rgba(x,x,x,x)"
1259 * @param window
1260 * 'window' from Javascript
1261 * @param webEngine
1262 * Web engine that the page is loaded in
1263 * @param frame
1264 * Expeditee frame to add the converted page to
1265 * @throws IllegalArgumentException
1266 * @throws IllegalAccessException
1267 */
1268 private static void addTextToFrame(int visibleWidth, int visibleHeight, JSObject window, WebEngine webEngine, Frame frame) throws InvocationTargetException,
1269 IllegalAccessException, IllegalArgumentException {
1270
1271 webEngine.executeScript("var walker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT, null, false);");
1272
1273 Node currentNode;
1274
1275 while ((currentNode = (Node) webEngine.executeScript("walker.nextNode()")) != null) {
1276 JSObject style;
1277 JSObject bounds;
1278
1279 // CSS style for the element
1280 style = (JSObject) window.call("getComputedStyle", new Object[] { currentNode.getParentNode() });
1281
1282 // Getting a rectangle that represents the area and position of the element
1283 bounds = (JSObject) ((JSObject) currentNode.getParentNode()).call("getBoundingClientRect", new Object[] {});
1284
1285 // TODO: This doesn't check if an element or any of its parent elements have position:fixed set - the only way to check seems to
1286 // be to walking through the element's parents until the document root is reached (or a recursive function)
1287 float x = Float.valueOf(bounds.getMember("left").toString());
1288 float y = Float.valueOf(bounds.getMember("top").toString());
1289
1290 float width = Float.valueOf(bounds.getMember("width").toString());
1291 float height = Float.valueOf(bounds.getMember("height").toString());
1292
1293 // Checking if the element is actually visible on the page
1294 if (width > 0 && height > 0 && x + width > 0 && y + height > 0 && x <= visibleWidth && y <= visibleHeight
1295 && !(((String) style.call("getPropertyValue", new Object[] { "display" })).equals("none"))
1296 && !(((String) style.call("getPropertyValue", new Object[] { "visibility" })).equals("hidden"))) {
1297
1298 String fontSize = ((String) style.call("getPropertyValue", new Object[] { "font-size" }));
1299
1300 // Trimming off the units (always px) from the font size
1301 fontSize = fontSize.substring(0, fontSize.length() - 2);
1302
1303 // Always returns in format "rgb(x,x,x)" or "rgba(x,x,x,x)"
1304 String color = (String) style.call("getPropertyValue", new Object[] { "color" });
1305
1306 // Always returns in format "rgb(x,x,x)" or "rgba(x,x,x,x)"
1307 String bgColorString = (String) style.call("getPropertyValue", new Object[] { "background-color" });
1308
1309 String align = (String) style.call("getPropertyValue", new Object[] { "text-align" });
1310
1311 // Returns comma-separated list of typefaces
1312 String typeface = (String) style.call("getPropertyValue", new Object[] { "font-family" });
1313
1314 String[] typefaces = typeface.split(", |,");
1315
1316 String weight = (String) style.call("getPropertyValue", new Object[] { "font-weight" });
1317
1318 String fontStyle = (String) style.call("getPropertyValue", new Object[] { "font-style" });
1319
1320 // Returns "normal" or a value in pixels (e.g. "10px")
1321 String letterSpacing = (String) style.call("getPropertyValue", new Object[] { "letter-spacing" });
1322
1323 // Returns a value in pixels (e.g. "10px")
1324 String lineHeight = (String) style.call("getPropertyValue", new Object[] { "line-height" });
1325
1326 String textTransform = (String) style.call("getPropertyValue", new Object[] { "text-transform" });
1327
1328 String linkUrl = (String) ((JSObject) currentNode.getParentNode()).getMember("href");
1329
1330 Boolean fontFound = false;
1331 Font font = null;
1332
1333 // Looping through all font-families listed in the element's CSS until one that is installed is
1334 // found, or the end of the list is reached, in which case the default font is used
1335 for (int j = 0; j < typefaces.length && !fontFound; j++) {
1336 if (typefaces[j].toLowerCase().equals("sans-serif")) {
1337 typefaces[j] = "SansSerif";
1338 } else if ((typefaces[j].toLowerCase().equals("arial"))) {
1339 // Have to use Arial Unicode, otherwise unicode characters display incorrectly
1340 // It seems that not all systems have this font (including some Windows machines),
1341 // but as long as the website has a general font type specified (e.g. "font-family: Arial, Sans-Serif"),
1342 // there should be no noticeable difference.
1343 typefaces[j] = "Arial Unicode MS";
1344 } else if ((typefaces[j].toLowerCase().equals("monospace"))) {
1345 typefaces[j] = "monospaced";
1346 }
1347
1348 // Regex will remove any inverted commas surrounding multi-word typeface names
1349 String familyName = typefaces[j].replaceAll("^'|'$", "");
1350 font = new Font(familyName);
1351 font.setStyle(Font.Style.PLAIN);
1352 font.setSize(12);
1353
1354 // If the font isn't found, Java just uses Font.DIALOG, so this check checks whether the font was found
1355 if (EcosystemManager.getFontManager().getActualFont(font).getFamilyName().toLowerCase().equals(familyName.toLowerCase())) {
1356 fontFound = true;
1357 }
1358 }
1359
1360 if (!fontFound) {
1361 font = new Font("Times New Roman");
1362 font.setStyle(Font.Style.PLAIN);
1363 font.setSize(12);
1364 }
1365
1366 String fontStyleComplete = "";
1367
1368 int weightInt = 0;
1369
1370 try {
1371 weightInt = Integer.parseInt(weight);
1372 } catch (NumberFormatException nfe) {
1373 // Use default value as set above
1374 }
1375
1376 // checking if font is bold - i.e. 'bold', 'bolder' or weight over 500
1377 if (weight.toLowerCase().startsWith("bold") || weightInt > 500) {
1378 fontStyleComplete = fontStyleComplete.concat("bold");
1379 }
1380
1381 if (fontStyle.toLowerCase().equals("italic") || fontStyle.toLowerCase().equals("oblique")) {
1382 fontStyleComplete = fontStyleComplete.concat("italic");
1383 }
1384
1385 float fontSizeFloat = 12;
1386
1387 try {
1388 fontSizeFloat = Float.valueOf(fontSize);
1389 } catch (NumberFormatException nfe) {
1390 // Use default value as set above
1391 }
1392
1393 float letterSpacingFloat = -0.008f;
1394
1395 try {
1396 letterSpacingFloat = (Integer.parseInt(letterSpacing.substring(0, letterSpacing.length() - 2)) / (fontSizeFloat));
1397 } catch (NumberFormatException nfe) {
1398 // Use default value as set above
1399 }
1400
1401 float lineHeightInt = -1;
1402
1403 try {
1404 lineHeightInt = (Float.parseFloat(lineHeight.substring(0, lineHeight.length() - 2)));
1405 } catch (NumberFormatException nfe) {
1406 // Use default value as set above
1407 }
1408
1409 Text t;
1410
1411 String textContent = currentNode.getTextContent().replaceAll("[^\\S\\n]+", " ");
1412 textContent = textContent.replaceAll("^(\\s)(\\n|\\r)", "");
1413
1414 if (textTransform.equals("uppercase")) {
1415 textContent = textContent.toUpperCase();
1416 } else if (textTransform.equals("lowercase")) {
1417 textContent = textContent.toUpperCase();
1418 }
1419
1420 // Adding the text to the frame. Expeditee text seems to be positioned relative to the baseline of the first line, so
1421 // the font size has to be added to the y-position
1422 t = frame.addText(Math.round(x), Math.round(y + fontSizeFloat), textContent, null);
1423
1424 t.setColor(rgbStringToColor(color));
1425 t.setBackgroundColor(rgbStringToColor(bgColorString));
1426 t.setFont(font.clone());
1427 t.setSize(fontSizeFloat);
1428 t.setFontStyle(fontStyleComplete);
1429 t.setLetterSpacing(letterSpacingFloat);
1430
1431 // Removing any spacing between lines allowing t.getLineHeight() to be used to get the actual height
1432 // of just the characters (i.e. distance from ascenders to descenders)
1433 t.setSpacing(0);
1434
1435 t.setSpacing(lineHeightInt - t.getLineHeight());
1436
1437 if (align.equals("left")) {
1438 t.setJustification(Justification.left);
1439 } else if (align.equals("right")) {
1440 t.setJustification(Justification.right);
1441 } else if (align.equals("center")) {
1442 t.setJustification(Justification.center);
1443 } else if (align.equals("justify")) {
1444 t.setJustification(Justification.full);
1445 }
1446
1447 // Font size is added to the item width to give a little breathing room
1448 t.setWidth(Math.round(width + (t.getSize())));
1449
1450 if (!linkUrl.equals("undefined")) {
1451 t.setAction("createFrameWithBrowser " + linkUrl);
1452 t.setActionMark(false);
1453 }
1454 }
1455 }
1456 }
1457
1458 /**
1459 * Used by the web parser to add Next, Previous, etc. buttons to the converted pages
1460 *
1461 * @param text
1462 * text to display on the button
1463 * @param link
1464 * Frame that the button will link to
1465 * @param action
1466 * Action to run when button is clicked
1467 * @param width
1468 * Width of the button
1469 * @param toAddTo
1470 * Frame to add the button to
1471 * @param anchorTop
1472 * @param anchorRight
1473 * @param anchorBottom
1474 * @param anchorLeft
1475 */
1476 private static void addButton(String text, String link, String action, int width, Frame toAddTo, Integer anchorTop, Integer anchorRight, Integer anchorBottom, Integer anchorLeft) {
1477 // Button to go to the next frame/page
1478 Text button = new Text(text);
1479
1480 button.setLink(link);
1481 button.addAction(action);
1482 button.setBorderColor(new Colour(0.7f, 0.7f, 0.7f));
1483 button.setBackgroundColor(new Colour(0.9f, 0.9f, 0.9f));
1484 button.setThickness(1);
1485 button.setLinkMark(false);
1486 button.setActionMark(false);
1487 button.setFamily("Roboto Condensed Light");
1488 button.setJustification(Justification.center);
1489 button.setWidth(width);
1490
1491 if (anchorTop != null) {
1492 button.setAnchorTop(anchorTop);
1493 }
1494
1495 if (anchorRight != null) {
1496 button.setAnchorRight(anchorRight);
1497 }
1498
1499 if (anchorBottom != null) {
1500 button.setAnchorBottom(anchorBottom);
1501 }
1502
1503 if (anchorLeft != null) {
1504 button.setAnchorLeft(anchorLeft);
1505 }
1506
1507 button.setID(toAddTo.getNextItemID());
1508 toAddTo.addItem(button);
1509
1510 }
1511}
Note: See TracBrowser for help on using the repository browser.