Kuro5hin.org: technology and culture, from the trenches
Spewing to dead tree format...

By kevin lyda in News
Thu Jun 29, 2000 at 12:11:59 PM EST
Tags: Technology (all tags)

We've got a fair number of Linux servers running here and they are responsible for a fair amount of income for the company. I've recently done a major rewrite that solves a number of problems, but they're all under-the-hood changes. Now I'm working on the paint job - invoices and reports. What UNIX tools can I use to generate stuff that will really impress the suits?


Let's face it, shorter transaction times and various other changes aren't really going to make anyone non-technical excited, but if I can print a logo on an invoice they'll get all gooey. Currently I'm looking at the PostScript modules on CPAN, but I can't seem to get both text and graphics. I'm not married to Perl, I just need to access a MySQL data source and spit out PostScript - what are my options in between there? For developers keen on getting free software into the corporate enterprise, this is a silly yet effective way to win supporters, so I'm sure this is useful to people besides myself!

Spewing to dead tree format... | 41 comments (36 topical, 5 editorial, 0 hidden)
use some ML language (5.00 / 1) (#1)
by Anonymous 242 on Thu Jun 29, 2000 at 09:50:47 AM EST

For your report content, use some device-independent markup language (sgml, html, xml, sdf, etc.) and then use some tool to transform the markup file into a device-meaningful format (dvi, ps, etc.).

For a quick and dirty solution, output to html and use html2ps. A better solution would be to output sgml and then use sgmltools to spew forth postscript or dvi.

I've been thinking on this problem a while, and have done a moderate amount of research, but I haven't started working on an implementation of anything yet. I intend to get around to building a system to do something like this but between lack of funds to buy some decent reference books (on xml or sgml) and lack of time between all the other pots I've got on the stove, I haven't done anything besides give a cursory look to the docs for projects such as sdf and sgmltools.



Re: use some ML language (none / 0) (#19)
by kevin lyda on Thu Jun 29, 2000 at 03:16:00 PM EST

why the extra level of indirection? why not a markup api and then have it spit out postscript?

[ Parent ]
Re: use some ML language (none / 0) (#31)
by Florian on Fri Jun 30, 2000 at 04:35:06 AM EST

Of course you can use the PostScript module from CPAN, but then you would have to do most of the hard things on your own: adjusting columns, kerning between chars, line breaking, hyphenation... It took Knuth several years to write TeX; you will not be able to implement a relevant subset of this functionality by Monday.
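
To give a feel for the "hard things" mentioned above, here is a minimal sketch of just one of them, greedy line breaking. It is an illustration only: the function name `wrap_to_width` is made up, and the flat average character width is an assumption standing in for real font metrics. Kerning and hyphenation are not handled at all.

```python
def wrap_to_width(text, max_width_pt, font_size=12, avg_char_em=0.5):
    """Greedily break text into lines that fit max_width_pt points.

    Assumption: every character is avg_char_em * font_size points wide.
    A real typesetter would read per-glyph widths from the font metrics.
    """
    char_width = font_size * avg_char_em
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if current and len(candidate) * char_width > max_width_pt:
            # The word does not fit on this line: close the line, start a new one.
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines
```

Each returned line would still need its own `moveto`/`show` pair in the generated PostScript, which is the bookkeeping a typesetting system does for you.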

[ Parent ]
What about TeX? (4.00 / 1) (#2)
by tjansen on Thu Jun 29, 2000 at 09:58:37 AM EST

Using PostScript directly means a lot of (low-level) work and makes it really hard to change the layout. As someone else already suggested, one of the alternatives would be to use SGML/XML. The problem with that is that while XML would be the perfect format for the content, you would have to create stylesheets, and this gets really difficult if you haven't done it before (unless you use a commercial tool like FrameMaker). The alternative is to use TeX (or LaTeX), where you can have control over the layout while not having to deal with all those rendering issues.

Re: What about TeX? (none / 0) (#7)
by kevin lyda on Thu Jun 29, 2000 at 12:05:43 PM EST

i don't need or want to maintain postscript docs. reread the question. using the PostScript cpan module i can build a postscript document - but i can't do it with graphics.

[ Parent ]
Re: What about TeX? (none / 0) (#17)
by Cryptnotic on Thu Jun 29, 2000 at 02:41:59 PM EST

TeX is not PostScript. It is a printer-independent format for describing documents. TeX documents are rendered by the "tex" program to a .dvi (device independent) file, which is then converted to a postscript file for printing by dvi2ps.

[ Parent ]
Re: What about TeX? (none / 0) (#20)
by kevin lyda on Thu Jun 29, 2000 at 03:21:19 PM EST

ok, thanks. i actually knew that. now if i have my data, and i know i want to output to a postscript printer, why do i want to use tex?

let's try this scenario. i'm writing a NeXT application. i want to create a ui. NeXT uses display postscript as its graphics engine - should i use tex and then "print" my ui to screen using dvi2dps? obviously not, i should use an api that spits out postscript.

look, i know markup languages are great and all, but that's not the tool for this problem. i want an api in some language that can communicate with mysql that i can use to build a document object with, and then i want it to dump itself as postscript.

think of the cgi object in perl - you build up a cgi object and then you dump it to the client (or you can do it in pieces).

[ Parent ]
Re: What about TeX? (none / 0) (#23)
by Qtmstr on Thu Jun 29, 2000 at 05:02:45 PM EST

look, i know markup languages are great and all, but that's not the tool for this problem. i want an api in some language that can communicate with mysql that i can use to build a document object with, and then i want it to dump itself as postscript.
Still, what is wrong with TeX or LaTeX, in that case? TeX can be used to do exactly what you describe. Write a perl script to spit out TeX, compile it, and use dvips. I know you objected to that, but why?

Also, offtopic: Why is display postscript used? What are its advantages? Wouldn't it be horribly slow?


Kuro5hin delenda est!
[ Parent ]
Re: What about TeX? (none / 0) (#24)
by kevin lyda on Thu Jun 29, 2000 at 05:45:50 PM EST

because i'd need to learn tex is part one. i need this done by monday. i remember tex (actually latex), i used it in uni 10 years ago, it wasn't horribly painful but then i never stuck graphics in it.

besides i need this report generator to run *quickly*. the design i have in mind is: query db about the merchants to bill; for each merchant {query db for transactions; build invoice with postscript api; pipe to lpr}. you're suggesting that i replace those last two steps with ... write raw tex to build invoice; write to file; run tex to build dvi file; run dvi2ps; run lpr. there are 1000 merchants now, there are plans to have many more soon. bye bye speed. plus there's a ton of error conditions to watch for (two extra forks, lots of disk activity, chances that my tex might fail to compile, etc).

i want to wow the suits but i already did all the hard work on the server. i don't want to blow it all on an invoice generator that takes three hours to run (vs. the current one that runs in under a minute).
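
The loop described above (query, build the invoice with a PostScript API, pipe to lpr) could look roughly like this in Python. This is a minimal sketch under stated assumptions, not the poster's code: `ps_escape`, `build_invoice` and `print_invoice` are made-up names, the database queries are left out, amounts are assumed to be non-negative integer cents, and a grey box stands in for a real logo. `rectfill` is a PostScript Level 2 operator, so an older Level 1 printer would need an explicit path instead.

```python
import subprocess

def ps_escape(text):
    """Escape backslashes and parentheses for a PostScript string literal."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

def build_invoice(merchant, transactions):
    """Return a single-page PostScript invoice as a string.

    transactions is a list of (description, amount_in_cents) pairs.
    """
    lines = [
        "%!PS",
        # Stand-in "logo": a filled grey box drawn with vector operators.
        "0.6 setgray 72 720 144 36 rectfill 0 setgray",
        "/Helvetica findfont 12 scalefont setfont",
        "72 690 moveto (Invoice for %s) show" % ps_escape(merchant),
    ]
    y = 660
    total = 0
    for description, cents in transactions:
        lines.append("72 %d moveto (%s) show" % (y, ps_escape(description)))
        lines.append("400 %d moveto (%d.%02d) show" % ((y,) + divmod(cents, 100)))
        total += cents
        y -= 16
    lines.append("72 %d moveto (Total) show" % (y - 8))
    lines.append("400 %d moveto (%d.%02d) show" % ((y - 8,) + divmod(total, 100)))
    lines.append("showpage")
    return "\n".join(lines) + "\n"

def print_invoice(ps_text, printer_cmd=("lpr",)):
    """Pipe the PostScript straight to the spooler, with no temporary file."""
    subprocess.run(list(printer_cmd), input=ps_text.encode("latin-1"), check=True)
```

Text and graphics end up on the same page simply because both are ordinary PostScript operators in one stream. The sketch does no pagination, so a merchant with more transactions than fit on one page would need extra `showpage` handling.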

[ Parent ]
Re: What about TeX? (none / 0) (#29)
by Chris Andreasen on Fri Jun 30, 2000 at 12:33:11 AM EST

i don't want to blow it all on an invoice generator that takes three hours to run (vs. the current one that runs in under a minute).
Just to defend LaTeX's speed, I just converted a 127 page LaTeX document (that's 127 pages of content, not code) to PostScript in about 50 seconds on my Pentium 133. The first time you have LaTeX and dvips convert to printable formats is extraordinarily slow because the font metrics have to be compiled, but every time after that it runs at a very decent speed. I would assume the documents you're looking to send would share the same fonts, graphics, layout, etc., so if you chose to write them up in LaTeX it might take ten to twenty minutes to get the first one off, then ten to twenty seconds for each additional one (unless you use LaTeX a lot on that machine, in which case the font metrics would already be compiled).
--------
Is public worship then, a sin,
That for devotions paid to Bacchus
The lictors dare to run us in,
and resolutely thump and whack us?

[ Parent ]
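
The figures quoted in this exchange can be turned into a rough estimate. This is back-of-envelope arithmetic only, using the numbers from the comments (1000 merchants, ten to twenty seconds per separate run, 127 pages in about 50 seconds) plus one assumption that is not in the thread: each invoice is a single page.

```python
MERCHANTS = 1000            # figure given earlier in the thread
PER_RUN_SECONDS = (10, 20)  # quoted range for each separately compiled document
PAGES, SECONDS = 127, 50    # the 127-page benchmark on a Pentium 133

# One LaTeX + dvips run per merchant.
separate_hours = tuple(MERCHANTS * s / 3600 for s in PER_RUN_SECONDS)

# All invoices in one document, assuming one page per invoice
# and the same per-page rate as the benchmark.
batched_minutes = MERCHANTS * (SECONDS / PAGES) / 60

print("separate runs: %.1f to %.1f hours" % separate_hours)
print("one batched document: about %.1f minutes" % batched_minutes)
```

On these numbers, one run per merchant lands in the "three hours" territory the original poster was worried about, while a single batched run would be a matter of minutes, though still slower than the existing under-a-minute generator.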
Re: What about TeX? (none / 0) (#30)
by ix on Fri Jun 30, 2000 at 03:37:43 AM EST

I teach LaTeX to freshmen without any computer skills, and after two hours they generate beautiful reports with graphics, tables, TOC and the works. LaTeX is the ideal tool for this. And if you want to, you can have it working by Monday morning and still catch a movie this weekend. Just make a template, have a perl script query the db and output to a .tex file. Typeset it with latex2pdf and in less than 30 seconds on a decent pII you have a beautiful pdf looking just like you want it. The only hard part is making the template look like it should, and that's not more than an hour's work or two if you are a LaTeX novice.
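
The "make a template, have a script fill it in" step might look like the following. The comment suggests Perl; this sketch uses Python purely for illustration. The names `tex_escape`, `INVOICE_TEMPLATE` and `render_invoice` are hypothetical, the `logo` graphic file is assumed to exist next to the .tex file, and amounts are assumed to be non-negative integer cents. The database query and the typesetting run are left out.

```python
from string import Template

# Characters that have special meaning in LaTeX and their printable forms.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}

def tex_escape(text):
    """Escape database text so it is typeset literally."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)

INVOICE_TEMPLATE = Template(r"""\documentclass{article}
\usepackage{graphicx}
\begin{document}
\includegraphics[width=4cm]{logo}

\section*{Invoice for $merchant}
\begin{tabular}{lr}
$rows
\end{tabular}
\end{document}
""")

def render_invoice(merchant, transactions):
    """Return LaTeX source for one invoice."""
    rows = "\n".join(
        r"%s & %d.%02d \\" % ((tex_escape(desc),) + divmod(cents, 100))
        for desc, cents in transactions
    )
    return INVOICE_TEMPLATE.substitute(merchant=tex_escape(merchant), rows=rows)
```

Escaping matters more than it looks: a merchant name containing `&` or `%` would otherwise make the document fail to compile, which is one of the error conditions raised earlier in the thread.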

[ Parent ]
Re: Display PS (none / 0) (#38)
by Matthew Weigel on Fri Jun 30, 2000 at 04:20:49 PM EST

Display PostScript pretty much guarantees that whatever shows up on the screen, shows up on the printer. Reasonably optimized, it's not horribly slow (it originally ran on 25MHz 68030's with a minimum of graphics processing power, and it was quite usable). It also allows high-level image description, which is preserved beyond the function calls and specific API used to create the image (since most everything can read PS). Text is also extremely readable, to the point that I can deal with my NeXT at a much higher resolution than my PC running OpenBSD.

Of course, it also gives users things like full-window drag much more easily than hacking it into the system after the fact. DisplayPDF is even cooler, since (as demonstrated by MacOS X) it provides some pretty transparency and image-warping effects (and if DisplayPS was usable on a 25MHz 68030, DisplayPDF on a 500MHz G4 can't be bad ;).


--Matthew Weigel
[ Parent ]
Perhaps pdf? (2.00 / 1) (#8)
by Anonymous Hero on Thu Jun 29, 2000 at 12:46:17 PM EST

I have had the same problem creating pretty reports out of a MySQL database. I use PHP (I don't know Perl) to suck the right data out of MySQL and to load a template file for the report that I have created in HTML.

PHP substitutes the variables into the template, writes it to a file, and then runs HTMLDOC (available from www.easysw.com), which creates a PDF. This pdf is then echo'd back to the browser.

It's not as elegant as XML and stylesheets, and it requires a reasonable amount of manual tweaking to get things looking right. And HTMLDOC (which is GPL) has some quirks--and it only allows for gifs or jpgs as images. But it works well, and the pdf files it generates are like 3-4k, so it's not a big deal to open them.

Hope that helps.
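
The same template-then-convert flow, sketched in Python rather than PHP. This is an illustration under assumptions, not the poster's script: `REPORT_TEMPLATE`, `render_report` and `htmldoc_command` are made-up names, `logo.gif` is a placeholder file, and amounts are assumed to be non-negative integer cents. The `--webpage` and `-f` options are HTMLDOC command-line options as I understand them; check them against the installed version.

```python
import html
from string import Template

REPORT_TEMPLATE = Template("""<html><body>
<img src="logo.gif">
<h1>Invoice for $merchant</h1>
<table>
$rows
</table>
</body></html>
""")

def render_report(merchant, transactions):
    """Substitute database values into the HTML template."""
    rows = "\n".join(
        '<tr><td>%s</td><td align="right">%d.%02d</td></tr>'
        % ((html.escape(desc),) + divmod(cents, 100))
        for desc, cents in transactions
    )
    return REPORT_TEMPLATE.substitute(merchant=html.escape(merchant), rows=rows)

def htmldoc_command(html_path, out_path):
    """Build the argument list for converting the written HTML file."""
    return ["htmldoc", "--webpage", "-f", out_path, html_path]
```

The command list would be handed to something like `subprocess.run(..., check=True)` after the HTML has been written to `html_path`.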

Re: Perhaps pdf? (or postscript) (none / 0) (#9)
by Anonymous Hero on Thu Jun 29, 2000 at 12:48:26 PM EST

Oh, one more thing. You can also generate postscript files with HTMLDOC (your choice, pdf or postscript).

[ Parent ]
PHP can do direct PDF output as well, I think. (none / 0) (#11)
by Anonymous Hero on Thu Jun 29, 2000 at 01:40:38 PM EST

Something else to think about.

[ Parent ]
HTML to PDF to browser? (none / 0) (#14)
by Anonymous Hero on Thu Jun 29, 2000 at 01:54:06 PM EST

Why do you turn HTML into a pdf if you're going to be looking at it on a browser anyway?

[ Parent ]
Re: HTML to PDF to browser? (none / 0) (#15)
by Anonymous Hero on Thu Jun 29, 2000 at 02:36:42 PM EST

So it prints properly--that was the point. HTML is just an easy intermediate format to create the template in. We use pdf because we want it to print how it looks (as opposed to HTML). The pdf is echo'd to the browser to make it easy to print.

[ Parent ]
Use SDF (2.00 / 1) (#10)
by Alhazred on Thu Jun 29, 2000 at 01:37:09 PM EST

SDF is a program written in perl which implements a very straightforward markup language which is basically POD on steroids. The SDF program can turn sdf markup into PDF, postscript, DVI, LaTeX, POD, HTML, RTF, etc etc etc, and you can write your own conversion drivers (as well as extending the language itself very easily).

Anyway, one way to do what you want would be to generate SDF markup, and then use SDF to convert that to postscript for printing. Along the way you get a freebie, you can turn it into HTML (or whatever else). That way maybe if for some reason you want a web viewable version of the invoice you basically only need to change the output format specifier on the SDF program.

The only real hitch I had with it was that most of the drivers that come with SDF seem to rely heavily on Adobe FrameMaker to do the grunt work. However, it's possible to get around that in various ways, like converting to LaTeX and from there to postscript or something like that. Play with it, you'll see what I mean.
That is not dead which may eternal lie And with strange aeons death itself may die.
Re: Use SDF (none / 0) (#21)
by kevin lyda on Thu Jun 29, 2000 at 03:22:26 PM EST

url?

[ Parent ]
Re: Use SDF (none / 0) (#22)
by zavyman on Thu Jun 29, 2000 at 04:57:48 PM EST

http://www.mincom.com/mtr/sdf/

[ Parent ]
Adobe PDF format wins over the suits (3.00 / 1) (#12)
by cable on Thu Jun 29, 2000 at 01:43:35 PM EST

They may not have heard of Postscript/Ghostscript or other formats, but chances are they heard of Adobe Acrobat PDF files.

Try PDFLib for Unix (http://www.pdflib.com/) and see about interfacing to it to generate those Adobe Acrobat files.

------------------
Only you, can help prevent Neb Rage!

Re: Adobe PDF format wins over the suits (none / 0) (#18)
by kevin lyda on Thu Jun 29, 2000 at 03:14:07 PM EST

ok, i'll check it out. however i want to *print* invoices. the suits want to see paper because i'm making *invoices* that are being sent to merchants. some guy in a cornershop in ballygonowhere will not be excited to see a pdf file.

believe it or not, computers are still used to create printed output.

[ Parent ]
Re: Adobe PDF format wins over the suits (none / 0) (#26)
by cesarb on Thu Jun 29, 2000 at 06:56:42 PM EST

PDF is just some hacked postscript. So you can treat it as if it were postscript -- view it with gv, parse it using ghostscript, or even pipe it to a postscript laser.

[ Parent ]
Re: Adobe PDF format wins over the suits (none / 0) (#27)
by hexmode on Thu Jun 29, 2000 at 06:59:20 PM EST

believe it or not, computers are still used to create printed output.

You do realize that PDF format is especially good for printing, don't you?

There are several Postscript and PDF libraries in CPAN. Do a search at http://search.cpan.org.

[ Parent ]

TeX and/or LaTeX (5.00 / 1) (#13)
by Cryptnotic on Thu Jun 29, 2000 at 01:46:17 PM EST

TeX and/or LaTeX. That's all you need to know. Makes amazingly beautiful looking output, and gives you complete control of how it will look. The syntax for doing tables is a bit weird at first, but once you figure it out, the tables can look really really good.

Plus, it's easy to output TeX from a perl script or a C program, as TeX is just a plain text format.

Re: TeX and/or LaTeX (none / 0) (#36)
by Morten Liebach on Fri Jun 30, 2000 at 06:22:37 AM EST

My first thought too, but it might be too slow for large volumes. It never ceases to amaze me how much time it can take to compile a document!

It does look very very good though :-)

Morten
http://m.mongers.org/weblog/
[ Parent ]

TROFF (4.00 / 1) (#25)
by stbalbach on Thu Jun 29, 2000 at 06:17:06 PM EST

As it says in _Unix in a Nutshell_

"Troff is designed for typesetting text files for laser printers"

We used it for our invoicing needs and there's nothing it could not do, and the output is excellent. It will handle logos and complicated typesetting requirements and can be integrated with all the other Unix tools via command line operation - we called it from C using system()
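
A rough idea of what generating troff input for an invoice could involve, sketched in Python rather than C. The assumptions are loud here: `troff_escape`, `build_troff` and `troff_to_postscript` are made-up names, GNU groff is assumed to be the installed troff, descriptions are assumed to contain no tab characters, amounts are assumed to be non-negative integer cents, and the logo is left out entirely.

```python
import subprocess

def troff_escape(text):
    """Keep data from being read as troff escapes or control lines."""
    text = text.replace("\\", "\\e")
    if text.startswith((".", "'")):
        text = "\\&" + text  # zero-width character hides the leading dot or quote
    return text

def build_troff(merchant, transactions):
    """Return troff source with a bold heading and a tbl table of amounts."""
    out = [
        ".ps 14", ".ft B",
        troff_escape("Invoice for " + merchant),
        ".ft R", ".ps 10", ".sp",
        ".TS",
        "l r.",  # tbl format: left-aligned text, right-aligned amount
    ]
    for desc, cents in transactions:
        # tbl separates columns with a tab by default.
        out.append("%s\t%d.%02d" % ((troff_escape(desc),) + divmod(cents, 100)))
    out.append(".TE")
    return "\n".join(out) + "\n"

def troff_to_postscript(source):
    """Run groff with the tbl preprocessor (-t) and PostScript output (-Tps)."""
    result = subprocess.run(["groff", "-t", "-Tps"], input=source.encode("ascii"),
                            stdout=subprocess.PIPE, check=True)
    return result.stdout
```

The PostScript bytes returned by `troff_to_postscript` could then be piped to `lpr`, which keeps everything in pipes instead of the temporary files a `system()` call usually involves.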

Re: TROFF (none / 0) (#33)
by kevin lyda on Fri Jun 30, 2000 at 05:33:10 AM EST

can it do graphics?

[ Parent ]
Re: TROFF (none / 0) (#34)
by Paul Dunne on Fri Jun 30, 2000 at 06:00:12 AM EST

Umm, er, well, would pic(1) suffice? No? Ah. I think you have to hack the generated PS to add graphics, which is hardly ideal. Troff does produce 1st-rate output, but it's really text-only.
http://dunne.home.dhs.org/
[ Parent ]
Re: TROFF (none / 0) (#35)
by Paul Dunne on Fri Jun 30, 2000 at 06:06:18 AM EST

After reading more about your requirements, I can see why troff isn't going to hack it for you in any case. How well do you know Postscript? I mean, it seems to me your quickest path is to knock up a template file in PS, with placeholders, and when generating invoices do a regexp search-and-replace to put the DB info into a PS invoice file generated from the template (I say file but you can do all this with pipes).
http://dunne.home.dhs.org/
[ Parent ]
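
The template idea above, a hand-written PostScript file with placeholders that get replaced per invoice, can be sketched as follows. The `@@NAME@@` placeholder syntax and the names `TEMPLATE`, `ps_escape` and `fill_template` are assumptions made for the illustration; a real template would be designed once by hand, logo included.

```python
import re

# Hypothetical template; in practice this would be read from a file.
TEMPLATE = """%!PS
/Helvetica findfont 12 scalefont setfont
72 700 moveto (Invoice for @@MERCHANT@@) show
72 680 moveto (Amount due: @@TOTAL@@) show
showpage
"""

def ps_escape(text):
    """Escape backslashes and parentheses for a PostScript string literal."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

def fill_template(template, values):
    """Replace every @@NAME@@ placeholder with the escaped value for NAME."""
    def replace(match):
        # A missing key raises KeyError, so a typo in the template is noticed
        # before anything reaches the printer.
        return ps_escape(str(values[match.group(1)]))
    return re.sub(r"@@([A-Z_]+)@@", replace, template)
```

Passing a function to `re.sub` rather than a replacement string avoids having backslashes in the data interpreted as regex escapes. The result can be written to the spooler's standard input, so no intermediate file is needed.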
Real Men Use Postscript (none / 0) (#28)
by mikelieman on Thu Jun 29, 2000 at 09:39:00 PM EST

Coded with vi!

Get the book, PostScript by Example

peace
Mike

-- I Miss Jerry
In Python, try PIDDLE (none / 0) (#37)
by Anonymous Hero on Fri Jun 30, 2000 at 09:24:32 AM EST

If you're generating anything in Python (most of the other threads assumed Perl), you might want to look at piddle, a graphics library for python. It does all sorts of output, but most importantly, it does postscript and pdf. So you can do computer output as well as printed output.

There's quite a good example of using it for exactly this purpose in the O'Reilly book Python Programming for Win32.



Re: In Python, try PIDDLE (none / 0) (#39)
by Anonymous Hero on Fri Jun 30, 2000 at 04:27:59 PM EST

Man, and I thought Gimp was a piss-poor excuse for a program name.

[ Parent ]
Re: In Python, try PIDDLE (none / 0) (#41)
by kevin lyda on Sat Jul 01, 2000 at 04:20:17 AM EST

bingo.

thanks!

[ Parent ]