> Gear table extractor/converters, So that we don't have to manually type everything
Blade
post Mar 7 2008, 10:27 PM
Post #1


Runner
******

Group: Members
Posts: 3,009
Joined: 25-September 06
From: Paris, France
Member No.: 9,466



Ok, it's finally done.
You'll find everything here

SR4Light can be used to convert the PDF to HTML files. More info here.
Extractor.pl is used to convert the gear tables in the HTML files to XML files. It works with Augmentation but doesn't work with Arsenal (because most of the gear tables are images rather than text).
Gear-1.xml is the XML file extracted from Augmentation.
XMLToDae is used to convert the XML files into Daegann's Character Generator's .Dae files.

To use XMLToDae you'll need a Java Virtual Machine on your computer. Chances are you already have one: if you can run the XMLToDae.jar file, you do.

Once inside the application, choose File->Open and choose the XML file you want to convert.
A popup window should appear. It appears each time there is a new category of item (GlobalType in the XML file) to ask you what kind of items are in this category in order to export the items to the right .dae file. For example in Augmentation, the first one is Cyberware, so you'll need to choose Cyberware.

You can choose to Export or Skip this category.

If you choose Export, you'll be back in the main window, where the details of each item are displayed. You can freely modify the fields. Once you're done, press "Export Item" to export the item or "Skip item" to ignore it. Once you've exported (or skipped) all the items of a category, the data is saved in the .dae file according to your choice.
If you already have a .dae file of the same name in the directory where XMLToDae.jar is located, the items will be appended at the end of that file. If not, a new file will be created.
If you quit before finishing a category, nothing will be saved.
Known bug: when starting a new category, the program will show you all the items you've already exported before the items of the new category. So right now the only way to do it is to convert a category, then close the program, start it again and skip the category you've already converted. I'll try to fix that if it's not too hard.

All items inside the same "GlobalType" in the XML file will be exported to the same .dae file. If you want some items to be exported to another .dae file, you'll have to move them inside the XML file.
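If you want to check which items sit under which GlobalType before opening the file in XMLToDae, a quick scan of the XML is enough. Here is a minimal Perl sketch, assuming the layout the extractor writes (one tag per line, `<GlobalType name="...">` and `<item name="...">`). The helper name `count_items_per_type` and the sample entries are made up for illustration; they are not part of XMLToDae or the extractor.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XMLToDae or Extractor.pl): counts the
# <item> entries found under each <GlobalType>, assuming one tag per line.
sub count_items_per_type {
    my @lines = @_;
    my %count;
    my $current;
    for my $line (@lines) {
        if ($line =~ /<GlobalType name="([^"]*)"/) {
            $current = $1;
            $count{$current} = 0 unless exists $count{$current};
        }
        elsif (defined $current && $line =~ /<item name="/) {
            $count{$current}++;
        }
    }
    return \%count;
}

# Made-up sample in the same shape as the extractor's output.
my @sample = (
    '<GlobalType name="Cyberware">',
    '<category name="Headware">',
    '<item name="Example Implant A">',
    '</item>',
    '<item name="Example Implant B">',
    '</item>',
    '</category>',
    '</GlobalType>',
    '<GlobalType name="Bioware">',
    '<category name="Basic">',
    '<item name="Example Graft">',
    '</item>',
    '</category>',
    '</GlobalType>',
);

my $counts = count_items_per_type(@sample);
for my $type (sort keys %$counts) {
    print "$type: $counts->{$type} item(s)\n";
}
```

To run it on a real file, read the lines of Gear-1.xml into the list instead of using the sample.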

I guess that covers it. If you've got any questions, feel free to ask. The source code is included in the zip file if anyone's interested. Feel free to do whatever you want with it.

Original post:

QUOTE
Recently I wanted to create a new character using Daegann's character generator, but the generator lacked Augmentation and Arsenal's gear.
I remember spending a lot of time manually adding most of SR4's BBB gear and I didn't want to do it again.
Then I realized that with my SR PDF to HTML converter I could get the table data in a format I could parse automatically.

So I decided to program a little script to convert the HTML pages with gear tables into XML files with all the gear data neatly stored.
Looks like it's working well. So far, I've got it working with Augmentation's tables. Arsenal is a little bit more of a problem, as nearly 2/3 of the text in the tables can't be recovered. I'll try to ask Adam about it.

Some of the data will still need manual processing, for example items whose cost field is "rating x n¥". The data also doesn't include the item descriptions (I can probably manage to add them, though). Even so, it's far less work for anyone who wants to add the gear to their character generator (or other SR4 program).

If anyone is interested, I'll upload it (I just have a few things to fix first). The next step will be to program an .XML<->.DAE (Daegann's generator data files) converter, which shouldn't be too hard to pull off.
Cadmus
post Mar 8 2008, 07:18 AM
Post #2


Moving Target
**

Group: Members
Posts: 325
Joined: 24-February 06
From: Kansas
Member No.: 8,304



sounds cool
Blade
post Mar 8 2008, 02:13 PM
Post #3

Ok, the script is finished.
It works with Augmentation after a few adjustments to the HTML files.

So I've got a nice XML file, nearly 3000 lines long, with all the data from Augmentation's gear tables inside. I don't know if I'm allowed to post it here.
sloejack
post Mar 8 2008, 02:31 PM
Post #4


Target
*

Group: Members
Posts: 17
Joined: 3-December 07
From: Chicago
Member No.: 14,474



QUOTE (Blade @ Mar 8 2008, 10:13 AM)
Ok, the script is finished.
It works with Augmentation after a few adjustments to the HTML files.

So I've got a nice XML file, nearly 3000 lines long, with all the data from Augmentation's gear tables inside. I don't know if I'm allowed to post it here.


Would you be willing to share your script?
Blade
post Mar 8 2008, 03:45 PM
Post #5

Sure, it's a Perl script, meant to be run on the HTML files you get from SR4Light.
You'll need to manually edit the HTML files first to fix some problems, mainly with column alignment.
For example, you'll need to change lines like this:
CODE
</span></nobr></DIV><DIV style="position:absolute;top:706;left:522;"<nobr><span class="ft5">(Rating x 6)FRating x 30,000¥

to
CODE
</span></nobr></DIV><DIV style="position:absolute;top:706;left:522;"<nobr><span class="ft5">(Rating x 6)F
</span></nobr></DIV><DIV style="position:absolute;top:706;left:670;"<nobr><span class="ft5">Rating x 30,000¥


(changing the "left" attribute shouldn't be necessary in most cases)
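Finding the fused cells by eye gets tedious, so something like the following could flag them. It is only a sketch under my own assumptions: that a fused cell is an availability in parentheses (optionally followed by R or F) glued to a cost ending in the nuyen sign, as in the example above. `split_fused_cell` is a hypothetical helper, not part of the script, and other kinds of fused columns would need their own pattern.

```perl
use strict;
use warnings;
binmode(STDOUT, ':encoding(UTF-8)');

my $YEN = "\x{A5}";    # the nuyen (yen) sign, written as an escape to avoid encoding trouble

# Hypothetical helper: splits an availability such as "(Rating x 6)F" from a
# cost ending in the nuyen sign when both ended up in the same cell.
# Returns (availability, cost) on a match, or the text unchanged otherwise.
sub split_fused_cell {
    my ($text) = @_;
    if ($text =~ /^(\([^)]*\)[RF]?)((?:Rating|[0-9]).*\x{A5})$/) {
        return ($1, $2);
    }
    return ($text);
}

for my $cell ("(Rating x 6)FRating x 30,000$YEN", "Rating x 30,000$YEN") {
    my @parts = split_fused_cell($cell);
    if (@parts > 1) {
        print "fused: $parts[0] | $parts[1]\n";
    }
    else {
        print "ok:    $parts[0]\n";
    }
}
```

The HTML still has to be edited by hand (or by a further script) to put the second part in its own DIV; this only tells you where to look.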

The script also doesn't support the case where an item has base attributes and "sub-items" with modifiers. For instance, you have to manually remove the base attributes of the altskin and insert them into each of the sub-items.

You might also need to remove some lines which aren't really useful and can interfere with the rest.

The best way to fix your HTML file is simply to run the script, look at the XML output file for problems, find what causes each problem in the HTML, and fix it.
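Part of the "look at the XML output for problems" step can be automated with a crude nesting check. This is a sketch, not a real XML validator: it assumes the simple output the script writes (no self-closing tags, no ">" inside attribute values), and `check_nesting` is a hypothetical helper name. The sample entries are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of problem descriptions, empty if every
# closing tag matches the most recently opened tag.
sub check_nesting {
    my @lines = @_;
    my (@stack, @problems);
    my $n = 0;
    for my $line (@lines) {
        $n++;
        # Visit each opening or closing tag on the line; "<?xml ...?>" is skipped
        # because the tag name must start with a letter.
        while ($line =~ /<(\/?)([A-Za-z]\w*)[^>]*>/g) {
            my ($closing, $tag) = ($1, $2);
            if (!$closing) {
                push @stack, $tag;
            }
            elsif (!@stack) {
                push @problems, "line $n: </$tag> with nothing open";
            }
            elsif ($stack[-1] ne $tag) {
                # A stray closing tag is reported and otherwise ignored.
                push @problems, "line $n: </$tag> does not match open <$stack[-1]>";
            }
            else {
                pop @stack;
            }
        }
    }
    push @problems, "never closed: <$_>" for @stack;
    return @problems;
}

my @sample = (
    '<category name="Example">',
    '<subitem name="Example Entry">',
    '<attribute name="Cost">100</attribute>',
    '</subitem>',
    '</item>',
    '</category>',
);
print "$_\n" for check_nesting(@sample);
```

Each reported line number points at the XML; from there you still have to trace the problem back to the HTML page that produced it.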

Here is the script:

CODE
#TODO: Handle the case where an item has both base attributes and subitems
#        Replace baaaad global variables with good local variables.

my $intype = 0;
my $incategory = 0;
my $begintype = 0;
my $page = 167;
my $calcul = 0;
$numattributes = -1;
$currentattribute = -1;

    #For getting the value of a rated attribute
sub getRating {
    my $rating = $_[0];
    my $Formula =  $_[1];
    my $type = 0;
    
    if ($Formula !~ /Rating/i) {
        return $Formula;
    }
    
        # Yes, I know, doing this each time for the same formula isn't an optimized way to do it. I don't care.
    if ($Formula =~ /¥/) {
        $type=1;
    }
    elsif ($Formula =~ /\[(.+)\]/) {
        $type=2;
    }
    elsif ($Formula =~ /\(.*\)(R|F|-)/) {
        $type=$1;
    }

        # Strip everything but the operator and the number (/g so that every comma and space goes)
    $Formula =~ s/¥//g;
    $Formula =~ s/\(//g;
    $Formula =~ s/\)//g;
    $Formula =~ s/,//g;
    $Formula =~ s/ //g;
    $Formula =~ s/Rating//i;
    $Formula =~ s/\[//g;
    $Formula =~ s/\]//g;
    if ($Formula =~ /x/) {
        $Formula =~ s/x//;
        $value = $rating * $Formula;
    }
    elsif ($Formula =~ /\+/) {
        $Formula =~ s/\+//;
        $value = $rating + $Formula;
    }
    else {
        $value = $rating;
    }
        # $type can hold a letter (R, F) or "-", so compare it as a string: "F" == 0 is true in Perl
    if ($type eq "0") {    return $value; }
    elsif ($type eq "1") { return $value."¥"; }
    elsif ($type eq "2") { return "\[".$value."\]"; }
    else { return "(".$value.")".$type; }
}

    #For creating one entry per rating of an item
sub exportRating() {

    if ($initem == 1) {
        $itemname =~ /Rating.*([0-9]+).([0-9]+)/;
        $minrating = $1;
        $maxrating = $2;
        $itemname =~ s/\(Rating.*//;
    }
    else {
        $subitemname =~ /Rating.*([0-9]+).([0-9]+)/;
        $minrating = $1;
        $maxrating = $2;
        $subitemname =~ s/\(Rating.*//;
    }

    for ($i=$minrating;$i<=$maxrating;$i++) {
        if ($initem == 1) {
            print FO "<item name=\"".$itemname."(Rating ".$i.")\">\n";
        }
        else {
            print FO "<subitem name=\"".$subitemname."(Rating ".$i.")\">\n";
        }
        for ($j=0;$j<$numattributes;$j++) {
            print FO "<attribute name=\"".$Attribute[$j]."\">".&getRating($i, $attributevalue[$j])."<\/attribute>\n";
        }
        if ($initem == 1) {
            print FO "<\/item>\n";
        }
        else {
            print FO "<\/subitem>\n";
        }
    }
    $hasRating = 0;
}

open (FI, "<" . "dAug-167.html") or die "Cannot open dAug-167.html: $!";
open (FO, ">" . "Gear-1.xml") or die "Cannot create Gear-1.xml: $!";

while ($calcul !~ /y/i && $calcul !~ /n/i) {
    print "Do you want rating calculations? (Y/N)";
    $calcul = <STDIN>
}

print FO "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n";
print FO "<Book name=\"Augmentation\">";

do {
    $ligne = <FI>;
        while ($ligne) {
            if($ligne =~ /^<\/span>/) {
                chomp($ligne);

                    #Fix for the last item of each page
                $ligne =~ s/<\/DIV>$//;
                    
                    # Fix for Arsenal Pages 168 to 170 (included) , 174, 175, 177 and 178
                if (($page > 167 && $page < 171) || $page == 174 || $page == 175 || $page == 177 || $page == 178) {
                    $ligne =~ /"ft(.)/;
                    $mod = $1+1;
                        # Fix for new types in these pages
                    if ($mod == 6) {
                        $mod = 3;
                    }
                    $ligne =~ s/"ft./"ft$mod/;
                }
                    # Fix for Arsenal Pages 171 and 172
                elsif ($page == 171 || $page == 172) {
                    $ligne =~ s/.*"ft3.*//;
                }
                    
                    # Global type (cyberware, bioware...)
                if ($ligne =~ /ft3/) {
                    $typename = substr($ligne, rindex($ligne,">")+1);
                    if ($begintype != 0) {
                        if ($initem == 2) {
                            if ($hasRating == 1) {
                                &exportRating();
                            }
                            else {
                                print FO "<\/subitem>\n";
                            }
                            $initem = 0;
                        }
                        if ($hasRating == 1) {
                            &exportRating();
                        }
                        else {
                            print FO "<\/item>\n";
                        }
                        print FO "<\/category> \n";
                        print FO "<\/GlobalType> \n";
                    }
                    else {
                        $begintype = 1;
                    }
                    print FO "<GlobalType name=\"".$typename."\"> \n";
                    $intype=1;
                }
                
                    # Category (headware, bodyware...) or Attributes
                elsif ($ligne =~ /ft4/) {
                        # Category
                    if ($incategory==0) {
                        $categoryname = substr($ligne, rindex($ligne,">")+1);
                        if ($intype==1) {
                            $intype=0;
                        }
                        else {
                            if ($initem == 2) {
                                if ($hasRating == 1) {
                                    &exportRating();
                                }
                                else {
                                    print FO "<\/subitem>\n";
                                }
                                $initem = 0;
                            }
                            if ($hasRating == 1) {
                                &exportRating();
                            }
                            else {
                                print FO "<\/item>\n";
                            }
                            print FO "<\/category> \n";
                        }
                        print FO "<category name=\"".$categoryname."\"> \n";
                        $incategory=1;
                        $currentattribute = 0;
                        $numattributes=-1;
                    }
                        # Attributes
                    else {
                        $Attribute[$currentattribute] = substr($ligne, rindex($ligne,">")+1);
                        $Attribute[$currentattribute] =~ s/ /_/g;
                        $currentattribute++;
                    }
                }
                    # Items and Attributes
                elsif ($ligne =~ /ft5/) {
                
                        #Check if we're out of an item with subitem
                    if ($initem == 2) {
                        $ligne =~ /left:(.*);/;
                        if ($left == $1) {
                            if ($hasRating == 1) {
                                &exportRating();
                            }
                            else {
                                print FO "<\/subitem>\n";
                            }
                            $initem = 0;
                        }
                    }
                
                        # New item
                    if ($initem != 2 && ($currentattribute == -1 || $numattributes == -1)) {
                            #First Item
                        if ($incategory==1) {
                            $incategory = 0;
                            $numattributes = $currentattribute;
                            $currentattribute=-1;
                            $hasRating = 0;
                        }
                        else {
                            if ($hasRating == 1) {
                                &exportRating();
                            }
                            else {
                                print FO "<\/item>\n";
                            }
                        }
                        $ligne =~ /left:(.*);/;
                        $left = $1;
                        $itemname = substr($ligne, rindex($ligne,">")+1);
                        $initem = 1;
                        if ($itemname =~ /\(Rating/ && $calcul =~ /y/i) {
                            $hasRating = 1;
                        }
                        else {
                            $hasRating = 0;
                            print FO "<item name=\"".$itemname."\"> \n";
                        }
                    }
                        # Attribute
                    else {
                            #Check if it's an attribute or a subclass of the item
                        $ligne =~ /left:(.*);/;
                            #Subclass
                        if ($1 < $left+30) {
                            if ($initem == 1) {
                                $initem = 2;
                            }
                            else {
                                if ($hasRating == 1) {
                                    &exportRating();
                                }
                                else {
                                    print FO "<\/subitem>\n";
                                }
                            }
                            $subitemname = substr($ligne, rindex($ligne,">")+1);
                            if ($subitemname =~ /\(Rating/ && $calcul =~ /y/i) {
                                $hasRating = 1;
                            }
                            else {
                                $hasRating = 0;
                                print FO "<subitem name=\"".$subitemname."\">\n";
                            }
                            $currentattribute = -1;
                        }
                            #Attribute
                        else {
                            $attributevalue[$currentattribute] = substr($ligne, rindex($ligne,">")+1);
                            if ($hasRating == 0) {
                                print FO "<attribute name=\"".$Attribute[$currentattribute]."\">".$attributevalue[$currentattribute]."<\/attribute>\n";
                            }
                        }
                    }
                    $currentattribute++;
                    
                    if ($currentattribute >= $numattributes) { $currentattribute = -1; }
                }
                
            }

            $ligne = <FI>;
        }
        $page++;
        #$initem = 0;
        #$incategory = 0;
        
    } while (open(FI,"<"."dAug-".$page.".html"));
    print FO "<\/item>\n";
    print FO "<\/category>\n";
    print FO "<\/GlobalType>\n";
    print FO "</Book>";
Blade
post Mar 13 2008, 09:54 PM
Post #6

Updated! (The post above has been updated with the latest script.)

* Now uses a generic "attribute" tag instead of one tag per attribute, so that a schema can easily be set up (and parsing may be easier with some languages/libraries).
* Option to export items/subitems with Rating into different items, with automatic attribute calculation!
* Known minor bug: you'll need to correct the "Retinal Adjusters" entry to get a correct XML file. Just replace <subitem with <item and remove the </subitem after the entry.
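For anyone who wants to do the rating calculation in their own tool, the idea is just to evaluate each "Rating x n" or "Rating + n" formula once per rating. The following is a simplified sketch, not the script's own getRating: `eval_rating_formula` is a hypothetical name, it only handles the bare numeric formula (no nuyen sign, brackets or availability letters), and the item name is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: evaluates a "Rating x n" / "Rating + n" formula for one
# rating. Anything it does not recognise is returned unchanged for manual editing.
sub eval_rating_formula {
    my ($rating, $formula) = @_;
    return $formula unless $formula =~ /Rating/i;    # fixed value, nothing to compute
    (my $clean = $formula) =~ s/[,\s]//g;            # "Rating x 30,000" -> "Ratingx30000"
    return $rating * $1 if $clean =~ /^Ratingx(\d+)$/i;
    return $rating + $1 if $clean =~ /^Rating\+(\d+)$/i;
    return $rating      if $clean =~ /^Rating$/i;
    return $formula;
}

# One entry per rating for an item whose name carries a rating range.
my $name = "Example Item (Rating 1-3)";    # made-up item
if ($name =~ /^(.*?)\s*\(Rating\s*(\d+)\D+(\d+)\)/) {
    my ($base, $min, $max) = ($1, $2, $3);
    for my $r ($min .. $max) {
        my $cost = eval_rating_formula($r, "Rating x 30,000");
        print "$base (Rating $r): $cost\n";
    }
}
```

Returning unrecognised formulas untouched is deliberate: it keeps odd entries visible in the output instead of silently turning them into wrong numbers.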

---

To do/status:
* I've reverse engineered Daegann's Character Generator's .dae files, so I should be able to export all Augmentation's gear into Daegann's Character Generator.
* Waiting for Adam's answer for Arsenal's tables and public distribution of the XML files.
Blade
post Mar 24 2008, 05:13 PM
Post #7

Updated! (The first post has been updated.)

* XML To Dae converter done.
Dumori
post Apr 1 2008, 10:03 PM
Post #8


Dumorimasoddaa
******

Group: Members
Posts: 2,687
Joined: 30-March 08
Member No.: 15,830



If you've made the full updated Aug and Arsenal .dat files, it would be a good idea to post a link to them in the DnCrg SR4 Character Generator (Early Dev) thread and/or in this one, to save people the time of redoing what's already been done.
Blade
post Apr 2 2008, 08:37 AM
Post #9

I don't have an XML to .dat converter. Right now, it's only for Daegann's chargen.
I also don't have anything for Arsenal.
Dumori
post Apr 2 2008, 11:45 AM
Post #10

That's what I meant. It would still be useful to post the updated Aug files.
Blade
post Apr 2 2008, 12:31 PM
Post #11

There are no updated Aug files for DnCrg; this is for Daegann's Character Generator, which is a different generator...
And I don't even have the file, just the converter.
Dumori
post Apr 6 2008, 02:01 AM
Post #12

Oh, by the way, if you run the text recognition tool in Adobe Acrobat 8, it will turn the text in the images into proper text. I haven't run it on the full Arsenal document, but it looks like it could work.
Blade
post Apr 7 2008, 03:38 PM
Post #13

Is it available in Reader?
Dumori
post Apr 7 2008, 11:18 PM
Post #14

I have no idea, I've got Pro. If you send me the scripts for Arsenal I could make the XML, but I can't get my SR4Light to work.
Blade
post Jun 10 2008, 08:39 PM
Post #15

Updated!

* Crude .Dae files with Augmentation's cyberware, bioware and other equipment are now available. These are just a temporary fix until someone does a more serious conversion job. Nanotech isn't in yet (I'll work on it) and there are no descriptions.
* Corrected a few bugs in XMLToDae: it's now possible to open an XML file that's not in the same directory as the program, and a bug with legality codes has been fixed. It's still not perfect though, check the "known bugs" in the first post for more information.

Topps, Inc has sole ownership of the names, logo, artwork, marks, photographs, sounds, audio, video and/or any proprietary material used in connection with the game Shadowrun. Topps, Inc has granted permission to the Dumpshock Forums to use such names, logos, artwork, marks and/or any proprietary materials for promotional and informational purposes on its website but does not endorse, and is not affiliated with the Dumpshock Forums in any official capacity whatsoever.