New Threads (Was...)

Joe Cooper wrote:

> I don't know what H-H you're reading but the one I'm looking at is
> exclusively made up of DocBook information.

I don't know which one you're talking about. I've
never seen an announcement beyond the one on
LinuxDoc.org, and Mark never announced new
versions. The only ones I've seen are as I
described. I'd certainly like to see the latest
version. What's the URL?

> 2. A button for volunteer SGML'ers to click, saying "I am willing to
> DocBookize HOWTO's for any comers.  Here is my email address."

I like that idea the best, although it might be
modified by having a count beside each name so
that we get better load distribution. A list with
e-mail addresses allows personal contact between
the writers and the volunteers without undue load
on the list. The list should have a FAQ, published
once a month, and one of the things in the FAQ
should be the location of the volunteer list.
Usenet groups do this regularly (FAQs).

> Good job.  I haven't looked at your example, but I imagine it will be
> quite helpful to have open in the window next to your editing window so
> you can get a nice overall view of things.

Mark has got the whole thing on his web site at

There's both a program listing showing the raw
SGML and a graphic showing the structure.

> We need a subset, and
> the template marks out that subset pretty well.

Nothing has been defined as a subset. In order to
have a subset, you need something that says,
baldly, "Here is the subset of DocBook 3.1 tags
LDP uses. Here's another subset that are OK to
use, but we'll ignore (as in search) them. Here's
the remaining tags that we would prefer you never
use."

> One could argue that the extraneous text is needed to make it clear what
> each section and tag is really about.  But I agree that it could
> probably stand to be shrunken somewhat.

The structure isn't clear. See the Graphic I
referenced above. It also isn't clear which tags
are required. The H-H should be coordinated with
the template.

> Ok.  But are people really confused at this point about which is which?
> I guess if they are, then the LinuxDoc template should be moved to an
> archive merely for posterity.

I'm not confused now, but I certainly was when I
started out. Here were these two DTDs and no clear
statement which was preferred. I had to get an
SGML source from the archives (the H-H as a matter
of fact) before I could tell what was really going
on.

> Again, Gary, you are imagining a scenario that just does not exist.  The
> policy has been agreed upon.  No minimized tags.  There is no reason
> handwritten SGML can't adhere strictly to the letter and intent of the
> DocBook/LDP guidelines.  You're allowing your own prejudices regarding
> standard Unix tools to cloud your vision.

Answering a later misconception, I'm not angry at
the rugged individualists, but human beings make
mistakes or exercise their creativity in
unexpected circumstances. If you want a standard
to be adhered to, the way to do it is not by
admonishing people to do it right, but to have
either a correct by construction toolset or (maybe
and) a filter that weeds out the nonconformities.
I'm all for a verbose filter - I don't like
go/nogo strong silent types.

The filter should do three things:
1) It should do a grammar scan so the DocBook
grammar is right.
2) It should note when required tags are not
present.
3) It should note when tags outside the allowed
subset are present.
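
A minimal sketch of checks 2 and 3, in Python rather than awk. The
REQUIRED and PERMITTED sets are placeholders invented for
illustration; no subset has been agreed on. The grammar scan (check
1) is not attempted here, and the output is deliberately verbose
rather than go/nogo.

```python
import re

# Hypothetical subsets for illustration only; the LDP has not defined these.
REQUIRED = {"article", "title", "author", "sect1"}
PERMITTED = REQUIRED | {"para", "abstract", "revhistory", "programlisting"}

# Matches the name of an opening tag; closing tags and <!DOCTYPE are skipped.
TAG_RE = re.compile(r"<([A-Za-z][A-Za-z0-9]*)")


def vet(sgml):
    """Return a list of human-readable notes about the tags used."""
    used = {name.lower() for name in TAG_RE.findall(sgml)}
    notes = []
    for tag in sorted(REQUIRED - used):
        notes.append(f"required tag <{tag}> is missing")
    for tag in sorted(used - PERMITTED):
        notes.append(f"tag <{tag}> is outside the allowed subset")
    return notes
```

An author could run vet() over a draft and read the notes before
submitting; an empty list means neither check found anything.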

The filter exists because I know human nature is
not going to change, and we'll always have rugged
individualists with us. So the answer is to find
toolsets that will do what we want, and still
provide a filter so that those who don't want to
use the toolsets can quietly vet themselves. The
filter can be the subject of ongoing debate.

> I brought up DocBook:TDG because it is an excellent reference and one of
> its points was that minimized markup can be expanded easily with
> sgmlnorm.  You stated your anger towards people who handwrite SGML and I
> told you of an easy way for even lazy handwriters to produce code to the
> letter and intent of the LDP policy.  That solution has not been heeded
> by you so far...but I offered.
> Regardless of that, there is no reason not to recommend people take a
> gander at TDG.  It is a good book.  DocBook is not a confusing markup
> language, and there is no reason someone can't gain valuable knowledge
> and understanding from a few minutes of browsing through relevant
> sections of TDG while composing a HOWTO.

I agree that it's a useful reference. My concern
was that it was being offered as a substitute for
the hard work of deciding on subsets.

> Finally, when the subset has been agreed upon, we should codify it and
> put it on the website in an easy to find spot.  Let's start talking
> about that subset.  I haven't seen a thread on it yet...shall we start
> one? 

Yes. How about "Subsets". We need three -
Required, permitted, and searchable.

> Perhaps we should begin by stripping all tags from the template
> (and example.sgm?) and annotate them?  Is that a good start for defining
> our subset?

It's a start, but the issue of searching needs to
be dealt with.

> Ok.  We all know that.  Now let's work on doing something about it.

I've since decided that the grammar of DocBook is
more complicated than a simple awk script can
handle. I think that a lex/yacc (bison on Linux)
parser working from a modified DTD layout file is
the way to go. The layout file will do two things:
1) embody the grammar of DocBook, 2) identify the
subsets used by LDP.
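
To make the layout-file idea concrete, here is a rough sketch of
what such a file might look like and how it could be loaded. The
line format ("element subset : children") and the entries are
assumptions made up for this example, not taken from the DocBook
DTD, and a small Python loader stands in for the lex/yacc front end.

```python
def load_layout(text):
    """Parse lines of the form 'element subset : child child ...'.

    Returns two dicts: element -> set of legal children, and
    element -> subset name (e.g. 'required' or 'permitted').
    """
    children, subset = {}, {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        head, _, kids = line.partition(":")
        name, level = head.split()
        children[name] = set(kids.split())
        subset[name] = level
    return children, subset


# Hypothetical layout; a real one would be derived from the DTD.
LAYOUT = """
# element  subset    : legal children
article    required  : title author sect1
title      required  :
author     required  :
sect1      required  : title para
para       permitted :
"""
```

Keeping the grammar and the subset labels in one file means a change
of policy is an edit to the file, not to the parser.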

> This is already the agreed upon policy.

If it is, some people haven't got the word. Also,
the reasons are not clearly stated anywhere, or at
least I don't know where.

> > 2) Define a set of required tags
> > that will be used for search.
> Ok.  Do you want to start the new thread on this, or shall I?  Where do
> we start...with index tags? section tags? something else?  What are the
> good tags to use for intelligent context sensitive searches?

We need required structure tags (like
<sect1>, <Article>, etc.), required identification
tags (like <Author> and subsidiary tags), required
history tags (like <RevisionHistory> and
subsidiary tags), search tags (like keyword
lists), indexing tags (I'm not sure what they are,
but they should mark points in the text. Maybe
link tags.) Deprecated tags. Other tags that are
OK, but not special. Whichever of us gets to it

> Or are we coming from the wrong direction...should the search tool know
> about most of the relevant tags (i.e. "example" tags, "code snippet"
> tags, etc.) and allow a search based on any of those criteria?

We have some existing examples. Many of the
library search engines do key word searches, title
searches, author searches, abstract full text
searches. In addition we can pull program listings
and other special items. It's probably a good
thread subject "Search Criteria".
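
As a rough illustration of pulling those search fields out of the
SGML, the sketch below builds one index record per document. It
assumes fully expanded tags (no minimization) and uses the DocBook
element names title, author, keyword, abstract and programlisting;
which tags the LDP would actually index is still undecided.
Entities such as &lt; are left untouched.

```python
import re


def field(sgml, tag):
    """Return the text of every <tag>...</tag> element, inner tags stripped."""
    pattern = re.compile(
        rf"<{tag}\b[^>]*>(.*?)</{tag}>", re.IGNORECASE | re.DOTALL
    )
    return [re.sub(r"<[^>]+>", "", body).strip() for body in pattern.findall(sgml)]


def index_record(sgml):
    """Collect the fields a library-style search might use."""
    return {
        "title": field(sgml, "title")[:1],  # first title is the document's
        "author": field(sgml, "author"),
        "keywords": field(sgml, "keyword"),
        "abstract": field(sgml, "abstract"),
        "listings": field(sgml, "programlisting"),
    }
```

A search front end could then match a query against one field at a
time, the way library catalogues do.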

> > 3) Put together an
> > on-line thesaurus of keywords.
> Ok, I've seen a Glossary suggested, but no thesaurus suggestion so far.
> Why a thesaurus?

A glossary would make a good howto. I suggested a
thesaurus because keywords can get out of hand. A
thesaurus would do two things: authors could avoid
new keywords if one already existed that met their
requirements. People doing searches could find out
which keywords were likely to hit their subject.
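
A thesaurus of this kind could be as small as a table from each
preferred keyword to the terms it replaces. The entries below are
invented for illustration; the lookup serves both uses, an author
checking for an existing keyword and a searcher finding which
keyword is likely to hit.

```python
# Hypothetical entries: preferred keyword -> terms it stands in for.
THESAURUS = {
    "ethernet": {"nic", "network card", "lan adapter"},
    "printing": {"lpd", "lpr", "print spooler"},
}


def preferred(term):
    """Return the preferred keyword for a term, or None if none exists yet."""
    t = term.strip().lower()
    for keyword, synonyms in THESAURUS.items():
        if t == keyword or t in synonyms:
            return keyword
    return None
```

A None result tells an author that a genuinely new keyword may be
justified.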

> Let's get those search terms settled on and we'll make sure that every
> HOWTO that has a glossary uses those tags for glossary items...so that a
> LDP-wide meta-dictionary can be constructed.
> Good idea Gary.  Let's get on it.  Sounds like another thread to me.

Glossary? We've already got Search Criteria.

> > 4) The question of
> > referring to other howtos is intimately connected
> > with the search and display problems.
> Yep.  Gotta figure out what to do directory wise.  Already a thread
> going on about this and internal referencing.

Which one?

> > 5) Define
> > (E.g. select from the DocBook DTD and publish)
> > indexing tags to allow for "go to" display of
> > cross-referenced or searched documents. At the
> > moment, HTML looks like the only format amenable
> > to this function.
> I don't get what you mean?  HTML is the only HyperText capable mode
> we've got (other than the very limited PDF), true.  But I think that's
> really where things like an intelligent search needs to happen anyway.
> Our only problem in this regard is making the search on the SGML using
> intelligent tag reading and then placing the reader into the correct
> spot in the HTML online and making it seamless.

Somehow we have to come up with a way to identify
a location in the SGML with a location we can tell
a viewer to go to. Otherwise we're stuck with
referencing the HOWTO as a whole.

A marker in the SGML probably has to transfer
invisibly to the database view, so we can tell the
viewer to go there.
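
One possible shape for such a marker is the id attribute on an SGML
element, carried through to a named anchor in the HTML. The sketch
below only covers the SGML side: it collects the ids and builds the
URL a viewer would be sent to. It assumes, without checking, that
the HTML conversion emits an anchor with the same name; the file
name in the usage note is made up.

```python
import re

# Matches an opening tag that carries an id="..." attribute.
ID_RE = re.compile(
    r'<([A-Za-z][A-Za-z0-9]*)\b[^>]*?\bid\s*=\s*"([^"]+)"', re.IGNORECASE
)


def anchors(sgml, html_url):
    """Map each id found in the SGML to a 'go to' URL in the HTML output."""
    return {ident: f"{html_url}#{ident}" for _tag, ident in ID_RE.findall(sgml)}
```

For example, anchors(source, "Example-HOWTO.html") would map an id
of "install" to "Example-HOWTO.html#install", so a search hit
inside that section can point there rather than at the HOWTO as a
whole.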

> > We'll need a scanner that parses incoming
> > SGML and annotates it for rugged individualists so
> > their submissions can be corrected to meet
> > requirements. There isn't any reason this can't be
> > made available to anyone who wants it, and there's
> > no reason it can't be something as simple and
> > understandable as an awk script.
> Get real Gary.  An automatic SGML tool can be used to generate
> deprecated tags, extraneous tags (that aren't in the LDP subset) and
> overall ugly code, just as much as a rugged individualists' vi can.  Get
> off the GUI high horse Gary.

There are two ways to do this: correct by
construction, and correct by repeated fixups. I
leave it to you to decide which has the least
wasted motion.

> Now, if you'd like for there to be a script to correct everyone's
> mistakes, I won't try to stop you...But really, I think you're way too
> worried about rabid vi users bringing anarchy to the LDP.  Brother, I
> gotta tell you, the LDP was built with vi and Emacs (with no SGML mode,
> in the beginning) long before either of us were around.

It isn't really difficult. The grammar of DocBook
is simple. A pushdown stack recognizer can
accommodate the two states an open tag can be in:
open and unreverted, and open and reverted. These
states govern which tags are legal as children,
and the legal tag lists can be truncated to the
LDP subset.
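
A simplified sketch of a stack recognizer along these lines. It
tracks only which tag is currently open, not the two states
described above, and it assumes fully expanded tags with explicit
closes. The LEGAL_CHILDREN table is a made-up fragment, not the
DocBook grammar or an agreed LDP subset.

```python
import re

# Hypothetical legal-children lists, already truncated to a small subset.
LEGAL_CHILDREN = {
    "article": {"title", "author", "sect1"},
    "sect1": {"title", "para"},
    "title": set(),
    "author": set(),
    "para": set(),
}

TOKEN_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")


def check(sgml):
    """Walk the tags with a stack and return notes on anything illegal."""
    stack, notes = [], []
    for m in TOKEN_RE.finditer(sgml):
        closing, name = m.group(1) == "/", m.group(2).lower()
        if not closing:
            if name not in LEGAL_CHILDREN:
                notes.append(f"<{name}> is not in the subset")
            elif stack and name not in LEGAL_CHILDREN.get(stack[-1], set()):
                notes.append(f"<{name}> is not allowed inside <{stack[-1]}>")
            stack.append(name)  # remember what is open
        elif stack and stack[-1] == name:
            stack.pop()  # close matches the innermost open tag
        else:
            notes.append(f"</{name}> does not match an open tag")
    notes.extend(f"<{name}> is never closed" for name in reversed(stack))
    return notes
```

The legal-children table is the only part that knows about DocBook,
so it could be generated from a layout file instead of being written
by hand.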

SGML is actually a simple language. It's far less
complicated than C or Pascal.

