Gear table extractor/converters, So that we don't have to manually type everything |
Mar 7 2008, 10:27 PM
Post
#1
|
|
Runner Group: Members Posts: 3,009 Joined: 25-September 06 From: Paris, France Member No.: 9,466 |
Ok, it's finally done.
You'll find everything here.

* SR4Light can be used to convert the PDF to HTML files. More info here.
* Extractor.pl converts the gear tables in the HTML files to XML files. It works with Augmentation but not with Arsenal (because most of Arsenal's gear tables are images rather than text).
* Gear-1.xml is the XML file extracted from Augmentation.
* XMLToDae converts the XML files into Daegann's Character Generator's .dae files.

To use XMLToDae you'll need a Java Virtual Machine on your computer. Chances are you already have one: if you can run the XMLToDae.jar file, you do.

Once inside the application, choose File->Open and pick the XML file you want to convert. A popup window should appear. It appears each time a new category of items (GlobalType in the XML file) starts, and asks you what kind of items are in this category so that they are exported to the right .dae file. For example, in Augmentation the first one is Cyberware, so you'll need to choose Cyberware. You can choose to Export or Skip this category.

If you choose Export, you'll be back in the main window, where the details of each item are displayed. You can freely modify the fields. Once you're done, press "Export Item" to export the item or "Skip item" to ignore it.

Once you've exported (or skipped) all the items of a category, the data is saved in the .dae file according to your choice. If you already have a .dae file of the same name in the directory where XMLToDae.jar is located, the items will be appended at the end of that file; if not, a new file will be created. If you quit before finishing a category, nothing will be saved.

Known bug: when starting a new category, the program shows you all the items you've already exported before the items of the new category. So right now the only way to do it is to convert a category, close the program, start it again and skip the category you've already converted. I'll try to fix that if it's not too hard.
All items inside the same "GlobalType" in the XML file will be exported to the same .dae file. If you want some items to be exported to another .dae file, you'll have to move them inside the XML file.

I guess that covers it. If you've got any questions, feel free to ask. The source code is included in the zip file if anyone's interested. Feel free to do whatever you want with it.

Original post:
QUOTE
Recently I wanted to create a new character using Daegann's character generator, but the generator lacked Augmentation and Arsenal's gear. I remember spending a lot of time manually adding most of SR4's BBB gear and I didn't want to do it again. Then I realized that with my SR PDF to HTML converter I could access the table data in a format I could automatically parse. So I decided to write a little script to convert the HTML pages with gear tables into XML files with all the gear data neatly stored.

Looks like it's working well. So far, I've got it working with Augmentation's tables. Arsenal is a bit more of a problem, as nearly 2/3 of the text in the tables can't be recovered. I'll try to ask Adam about it.

Some of the data will still need some manual processing (for example the items where the cost field is "rating x n¥"), and the data doesn't include the item descriptions (I can probably manage to add them though), but it's still far less work for those who want to add the gear to their character generators (or other SR4 programs).

If anyone is interested, I'll upload it (I just have a few things to fix first). The next step will be to write an .XML<->.DAE (Daegann's generator data files) converter, which shouldn't be too hard to pull off. |
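The rule that every item under one "GlobalType" lands in the same .dae file can be pictured with a short sketch. It is written in Python rather than the Java of XMLToDae, and it only groups item names per GlobalType. The element names (Book, GlobalType, category, item, attribute) are the ones Extractor.pl writes; the sample data and the function name `items_by_global_type` are invented for illustration. It does not write .dae files, since their format is not described in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape the extractor writes; all values are invented.
SAMPLE = """<Book name="Augmentation">
<GlobalType name="Cyberware">
<category name="Headware">
<item name="Example Implant A"><attribute name="Essence">0.1</attribute></item>
<item name="Example Implant B"><attribute name="Essence">0.2</attribute></item>
</category>
</GlobalType>
<GlobalType name="Bioware">
<category name="Basic">
<item name="Example Organ"><attribute name="Essence">0.3</attribute></item>
</category>
</GlobalType>
</Book>"""


def items_by_global_type(xml_text):
    """Return {GlobalType name: [item names]} in document order."""
    root = ET.fromstring(xml_text)
    groups = {}
    for global_type in root.findall("GlobalType"):
        # iter() walks every category below this GlobalType
        names = [item.get("name") for item in global_type.iter("item")]
        groups[global_type.get("name")] = names
    return groups


if __name__ == "__main__":
    for name, items in items_by_global_type(SAMPLE).items():
        print(name, "->", items)
```

Moving an item to another .dae file then amounts to moving its `<item>` element under a different `<GlobalType>` before running the converter.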
|
|
Mar 8 2008, 07:18 AM
Post
#2
|
|
Moving Target Group: Members Posts: 325 Joined: 24-February 06 From: Kansas Member No.: 8,304 |
sounds cool
|
|
|
Mar 8 2008, 02:13 PM
Post
#3
|
|
Runner Group: Members Posts: 3,009 Joined: 25-September 06 From: Paris, France Member No.: 9,466 |
Ok, the script is finished.
It works with Augmentation after a few adjustments to the HTML files. So I've got a nice XML file, nearly 3000 lines long, with all the data from Augmentation's gear tables inside. I don't know if I'm allowed to post it here. |
|
|
Mar 8 2008, 02:31 PM
Post
#4
|
|
Target Group: Members Posts: 17 Joined: 3-December 07 From: Chicago Member No.: 14,474 |
QUOTE
Ok, the script is finished. It works with Augmentation after a few adjustments to the HTML files. So I've got a nice XML file, nearly 3000 lines long, with all the data from Augmentation's gear tables inside. I don't know if I'm allowed to post it here.

Would you be willing to share your script? |
|
|
Mar 8 2008, 03:45 PM
Post
#5
|
|
Runner Group: Members Posts: 3,009 Joined: 25-September 06 From: Paris, France Member No.: 9,466 |
Sure, it's a Perl script, meant to be run on the HTML files you get from SR4Light.
You'll need to manually edit the HTML files first to fix some problems, mostly column alignment issues. For example, you'll need to change lines like this:
CODE
</span></nobr></DIV><DIV style="position:absolute;top:706;left:522;"<nobr><span class="ft5">(Rating x 6)FRating x 30,000¥
to
CODE
</span></nobr></DIV><DIV style="position:absolute;top:706;left:522;"<nobr><span class="ft5">(Rating x 6)F
</span></nobr></DIV><DIV style="position:absolute;top:706;left:670;"<nobr><span class="ft5">Rating x 30,000¥
(changing the "left" attribute shouldn't be necessary in most cases)

The script also doesn't support the case where an item has base attributes and "sub-items" with modifiers. For instance, you have to manually remove the base attributes of the altskin and insert them into each of the sub-items. You might also need to remove some lines which aren't really useful and can mess up the rest.

The best way to fix your HTML file is just to run the script, look at the XML output file for problems, find what causes each problem in the HTML and fix it.

Here is the script:
CODE
#TODO: Handle the case where an item has both base attributes and subitems
# Replace baaaad global variables with good local variables.
my $intype = 0;
my $incategory = 0;
my $begintype = 0;
my $page = 167;
my $calcul = 0;
$numattributes = -1;
$currentattribute = -1;

#For getting the value of a rated attribute
sub getRating {
    my $rating = $_[0];
    my $Formula = $_[1];
    my $type = 0;
    if ($Formula !~ /Rating/i) {
        return $Formula;
    }
    # Yes, I know, doing this each time for the same formula isn't an optimized way to do it. I don't care.
    if ($Formula =~ /¥/) {
        $type=1;
    } elsif ($Formula =~ /\[(.+)\]/) {
        $type=2;
    } elsif ($Formula =~ /\(.*\)(R|F|-)/) {
        $type=$1;
    }
    $Formula =~ s/¥//;
    $Formula =~ s/\(//;
    $Formula =~ s/\)//;
    $Formula =~ s/,//;
    $Formula =~ s/ //;
    $Formula =~ s/Rating//;
    $Formula =~ s/\[//;
    $Formula =~ s/\]//;
    if ($Formula =~ /x/) {
        $Formula =~ s/x//;
        $value = $rating * $Formula;
    } elsif ($Formula =~ /\+/) {
        $Formula =~ s/\+//;
        $value = $rating + $Formula;
    } else {
        $value = $rating;
    }
    # Compare as strings: $type may hold a letter (R, F or -),
    # and a letter compared with == would be treated as 0.
    if ($type eq "0") {
        return $value;
    } elsif ($type eq "1") {
        return $value."¥";
    } elsif ($type eq "2") {
        return "\[".$value."\]";
    } else {
        return "(".$value.")".$type;
    }
}

#For creating one entry per rating of an item
sub exportRating() {
    if ($initem == 1) {
        $itemname =~ /Rating.*([0-9]+).([0-9]+)/;
        $minrating = $1;
        $maxrating = $2;
        $itemname =~ s/\(Rating.*//;
    } else {
        $subitemname =~ /Rating.*([0-9]+).([0-9]+)/;
        $minrating = $1;
        $maxrating = $2;
        $subitemname =~ s/\(Rating.*//;
    }
    for ($i=$minrating;$i<=$maxrating;$i++) {
        if ($initem == 1) {
            print FO "<item name=\"".$itemname."(Rating ".$i.")\">\n";
        } else {
            print FO "<subitem name=\"".$subitemname."(Rating ".$i.")\">\n";
        }
        for ($j=0;$j<$numattributes;$j++) {
            print FO "<attribute name=\"".$Attribute[$j]."\">".&getRating($i, $attributevalue[$j])."<\/attribute>\n";
        }
        if ($initem == 1) {
            print FO "<\/item>\n";
        } else {
            print FO "<\/subitem>\n";
        }
    }
    $hasRating = 0;
}

open (FI, "<" . "dAug-167.html") or die "Cannot open dAug-167.html: $!";
open (FO, ">" . "Gear-1.xml") or die "Cannot open Gear-1.xml: $!";

while ($calcul !~ /y/i && $calcul !~ /n/i) {
    print "Do you want rating calculations? (Y/N)";
    $calcul = <STDIN>;
}

print FO "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n";
print FO "<Book name=\"Augmentation\">";

# One pass per HTML page; the loop ends when the next page file can't be opened
do {
    $ligne = <FI>;
    while ($ligne) {
        if($ligne =~ /^<\/span>/) {
            chomp($ligne);
            #Fix for the last item of each page
            $ligne =~ s/<\/DIV>$//;
            # Fix for Arsenal Pages 168 to 170 (included) , 174, 175, 177 and 178
            if (($page > 167 && $page < 171) || $page == 174 || $page == 175 || $page == 177 || $page == 178) {
                $ligne =~ /"ft(.)/;
                $mod = $1+1;
                # Fix for new types in these pages
                if ($mod == 6) {
                    $mod = 3;
                }
                $ligne =~ s/"ft./"ft$mod/;
            }
            # Fix for Arsenal Pages 171 and 172
            elsif ($page == 171 || $page == 172) {
                $ligne =~ s/.*"ft3.*//;
            }

            # Global type (cyberware, bioware...)
            if ($ligne =~ /ft3/) {
                $typename = substr($ligne, rindex($ligne,">")+1);
                if ($begintype != 0) {
                    if ($initem == 2) {
                        if ($hasRating == 1) {
                            &exportRating();
                        } else {
                            print FO "<\/subitem>\n";
                        }
                        $initem = 0;
                    }
                    if ($hasRating == 1) {
                        &exportRating();
                    } else {
                        print FO "<\/item>\n";
                    }
                    print FO "<\/category> \n";
                    print FO "<\/GlobalType> \n";
                } else {
                    $begintype = 1;
                }
                print FO "<GlobalType name=\"".$typename."\"> \n";
                $intype=1;
            }
            # Category (headware, bodyware...) or Attributes
            elsif ($ligne =~ /ft4/) {
                # Category
                if ($incategory==0) {
                    $categoryname = substr($ligne, rindex($ligne,">")+1);
                    if ($intype==1) {
                        $intype=0;
                    } else {
                        if ($initem == 2) {
                            if ($hasRating == 1) {
                                &exportRating();
                            } else {
                                print FO "<\/subitem>\n";
                            }
                            $initem = 0;
                        }
                        if ($hasRating == 1) {
                            &exportRating();
                        } else {
                            print FO "<\/item>\n";
                        }
                        print FO "<\/category> \n";
                    }
                    print FO "<category name=\"".$categoryname."\"> \n";
                    $incategory=1;
                    $currentattribute = 0;
                    $numattributes=-1;
                }
                # Attributes
                else {
                    $Attribute[$currentattribute] = substr($ligne, rindex($ligne,">")+1);
                    $Attribute[$currentattribute] =~ s/ /_/g;
                    $currentattribute++;
                }
            }
            # Items and Attributes
            elsif ($ligne =~ /ft5/) {
                #Check if we're out of an item with subitem
                if ($initem == 2) {
                    $ligne =~ /left:(.*);/;
                    if ($left == $1) {
                        if ($hasRating == 1) {
                            &exportRating();
                        } else {
                            print FO "<\/subitem>\n";
                        }
                        $initem = 0;
                    }
                }
                # New item
                if ($initem != 2 && ($currentattribute == -1 || $numattributes == -1)) {
                    #First Item
                    if ($incategory==1) {
                        $incategory = 0;
                        $numattributes = $currentattribute;
                        $currentattribute=-1;
                        $hasRating = 0;
                    } else {
                        if ($hasRating == 1) {
                            &exportRating();
                        } else {
                            print FO "<\/item>\n";
                        }
                    }
                    $ligne =~ /left:(.*);/;
                    $left = $1;
                    $itemname = substr($ligne, rindex($ligne,">")+1);
                    $initem = 1;
                    if ($itemname =~ /\(Rating/ && $calcul =~ /y/i) {
                        $hasRating = 1;
                    } else {
                        $hasRating = 0;
                        print FO "<item name=\"".$itemname."\"> \n";
                    }
                }
                # Attribute
                else {
                    #Check if it's an attribute or a subclass of the item
                    $ligne =~ /left:(.*);/;
                    #Subclass
                    if ($1 < $left+30) {
                        if ($initem == 1) {
                            $initem = 2;
                        } else {
                            if ($hasRating == 1) {
                                &exportRating();
                            } else {
                                print FO "<\/subitem>\n";
                            }
                        }
                        $subitemname = substr($ligne, rindex($ligne,">")+1);
                        if ($subitemname =~ /\(Rating/ && $calcul =~ /y/i) {
                            $hasRating = 1;
                        } else {
                            $hasRating = 0;
                            print FO "<subitem name=\"".$subitemname."\">\n";
                        }
                        $currentattribute = -1;
                    }
                    #Attribute
                    else {
                        $attributevalue[$currentattribute] = substr($ligne, rindex($ligne,">")+1);
                        if ($hasRating == 0) {
                            print FO "<attribute name=\"".$Attribute[$currentattribute]."\">".$attributevalue[$currentattribute]."<\/attribute>\n";
                        }
                    }
                }
                $currentattribute++;
                if ($currentattribute >= $numattributes) {
                    $currentattribute = -1;
                }
            }
        }
        $ligne = <FI>;
    }
    $page++;
    #$initem = 0;
    #$incategory = 0;
} while (open(FI,"<"."dAug-".$page.".html"));

print FO "<\/item>\n";
print FO "<\/category>\n";
print FO "<\/GlobalType>\n";
print FO "</Book>"; |
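The advice to "look at the XML output file for problems" can be partly automated. Here is a minimal sketch in Python (not the thread's Perl) that reports the first place where the output stops being well-formed. The function name `first_xml_problem` and the sample strings are made up for illustration, and it only catches structural errors such as an unclosed `<subitem>`, not wrong values in otherwise valid XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical broken output: the <subitem> is never closed.
BROKEN = """<Book name="Demo">
<item name="A">
<subitem name="B">
</item>
</Book>"""


def first_xml_problem(xml_text):
    """Return None if xml_text is well-formed, else (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # position of the first parse error
        return (line, column, str(err))
    return None


if __name__ == "__main__":
    print(first_xml_problem(BROKEN))
```

To check a real file, one could pass the bytes read from Gear-1.xml (opened in binary mode) so that the encoding declared in its first line is honoured, then look near the reported line for the HTML row that caused it.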
|
|
Mar 13 2008, 09:54 PM
Post
#6
|
|
Runner Group: Members Posts: 3,009 Joined: 25-September 06 From: Paris, France Member No.: 9,466 |
Updated! (Post above has been updated with the latest script)
* Now uses a generic "attribute" tag instead of a tag per attribute, so that a schema can be set easily (and parsing may be easier with some languages/libraries).
* Option to export items/subitems with a Rating as separate items, with automatic attribute calculation!
* Known minor bug: you'll need to correct the "Retinal Adjusters" entry to get a correct XML file. Just replace <subitem with <item and remove the </subitem after the entry.
---
To do/status:
* I've reverse engineered Daegann's Character Generator's .dae files, so I should be able to export all of Augmentation's gear into Daegann's Character Generator.
* Waiting for Adam's answer about Arsenal's tables and public distribution of the XML files. |
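The "one item per rating" option can be sketched as follows. This is a Python illustration of the idea, not a port of the Perl script: the function names, the regular expressions and the item name are assumptions, while the two formula strings are the ones quoted earlier in the thread. Like the script, it drops the thousands separator from computed values.

```python
import re

# "Rating", optionally followed by "x N" or "+ N" (N may contain thousands commas)
FORMULA = re.compile(r"Rating(?:\s*([x+])\s*(\d+(?:,\d{3})*))?", re.IGNORECASE)
# A rating range in an item name, e.g. "(Rating 1-6)"
RANGE = re.compile(r"\s*\(Rating\s*(\d+)\D+(\d+)\)", re.IGNORECASE)


def value_for_rating(formula, rating):
    """Replace the Rating expression with its value, keeping ¥, [] or ()F around it."""
    match = FORMULA.search(formula)
    if match is None:
        return formula  # not rating-dependent, keep as is
    operator, operand = match.group(1), match.group(2)
    if operator is None:
        value = rating
    else:
        number = int(operand.replace(",", ""))
        value = rating * number if operator.lower() == "x" else rating + number
    return formula[:match.start()] + str(value) + formula[match.end():]


def expand_item(name, attributes):
    """Return one (name, attributes) pair per rating in the item's range."""
    match = RANGE.search(name)
    if match is None:
        return [(name, dict(attributes))]
    low, high = int(match.group(1)), int(match.group(2))
    base = name[:match.start()]
    return [
        (f"{base} (Rating {r})",
         {key: value_for_rating(val, r) for key, val in attributes.items()})
        for r in range(low, high + 1)
    ]


if __name__ == "__main__":
    rows = expand_item("Example Implant (Rating 1-3)",
                       {"Availability": "(Rating x 6)F", "Cost": "Rating x 30,000¥"})
    for row in rows:
        print(row)
```

Substituting the value in place means the surrounding markers (the legality letter, the brackets, the ¥ sign) never have to be detected separately.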
|
|
Mar 24 2008, 05:13 PM
Post
#7
|
|
Runner Group: Members Posts: 3,009 Joined: 25-September 06 From: Paris, France Member No.: 9,466 |
Updated! (First post has been updated)
* XML To Dae converter done. |
|
|
Apr 1 2008, 10:03 PM
Post
#8
|
|
Dumorimasoddaa Group: Members Posts: 2,687 Joined: 30-March 08 Member No.: 15,830 |
If you've made the full updated Aug and Arsenal .dat files, it would be a good idea to post a link to them in the DnCrg SR4 Character Generator (Early Dev) thread and/or in this one, to save people the time of doing what's already been done.
|
|
|
Apr 2 2008, 08:37 AM
Post
#9
|
|
Runner Group: Members Posts: 3,009 Joined: 25-September 06 From: Paris, France Member No.: 9,466 |
I don't have an XML to .dat converter. Right now, it's only for Daegann's chargen.
I also don't have anything for Arsenal. |
|
|
Apr 2 2008, 11:45 AM
Post
#10
|
|
Dumorimasoddaa Group: Members Posts: 2,687 Joined: 30-March 08 Member No.: 15,830 |
That's what I meant. It would still be useful to post the updated Aug files.
|
|
|
Apr 2 2008, 12:31 PM
Post
#11
|
|
Runner Group: Members Posts: 3,009 Joined: 25-September 06 From: Paris, France Member No.: 9,466 |
There are no updated Aug files for DnCrg; it's for Daegann's Character Generator, which is another generator...
And I don't even have the file, just the converter. |
|
|
Apr 6 2008, 02:01 AM
Post
#12
|
|
Dumorimasoddaa Group: Members Posts: 2,687 Joined: 30-March 08 Member No.: 15,830 |
Oh, by the way, if you run the text recognition tool in Adobe Acrobat 8, it will turn the text in the images into proper text. I haven't done it on the full Arsenal document, but it looks like it could work.
|
|
|
Apr 7 2008, 03:38 PM
Post
#13
|
|
Runner Group: Members Posts: 3,009 Joined: 25-September 06 From: Paris, France Member No.: 9,466 |
Is it available in Reader?
|
|
|
Apr 7 2008, 11:18 PM
Post
#14
|
|
Dumorimasoddaa Group: Members Posts: 2,687 Joined: 30-March 08 Member No.: 15,830 |
I have no idea, I've got Pro. If you send me your scripts for Arsenal I could make the XML, but I can't get my SR4Light to work.
|
|
|
Jun 10 2008, 08:39 PM
Post
#15
|
|
Runner Group: Members Posts: 3,009 Joined: 25-September 06 From: Paris, France Member No.: 9,466 |
Updated!
* Crude .dae files with Augmentation's cyberware, bioware and other equipment are now available. They are just a temporary fix until someone does a more serious conversion. Nanotech isn't in yet (I'll work on it) and there are no descriptions.
* Corrected a few bugs in XMLToDae: it's now possible to open an XML file that's not in the same directory as the program, and a bug with legality codes has been fixed. It's still not perfect though; check the "known bugs" in the first post for more information. |
|
|