<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Bijective BWT</title>
	<atom:link href="http://www.c10n.info/archives/721/feed" rel="self" type="application/rss+xml" />
	<link>http://www.c10n.info/archives/721</link>
	<description>All about the most recent compression techniques, algorithms, patents, products, tools and events.</description>
	<pubDate>Thu, 09 Sep 2010 08:50:55 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6</generator>
		<item>
		<title>By: David Scott</title>
		<link>http://www.c10n.info/archives/721#comment-350583</link>
		<dc:creator>David Scott</dc:creator>
		<pubDate>Mon, 14 Jun 2010 14:48:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-350583</guid>
		<description>There are many things BWTS could be used for. I also have a bijective distance coder. If I can get off my lazy ass and actually code the two together, it would make a nice bijective compressor, especially for short files, where it would be a good compressor to use before an encryption pass. I have not written much here lately; I thought this thread was dead.</description>
		<content:encoded><![CDATA[<p>There are many things BWTS could be used for. I also have a bijective distance coder. If I can get off my lazy ass and actually code the two together, it would make a nice bijective compressor, especially for short files, where it would be a good compressor to use before an encryption pass. I have not written much here lately; I thought this thread was dead.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jarek</title>
		<link>http://www.c10n.info/archives/721#comment-350522</link>
		<dc:creator>Jarek</dc:creator>
		<pubDate>Mon, 14 Jun 2010 07:54:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-350522</guid>
		<description>Thanks - it was only a comment on the book - there is a lot about BWT there, but I haven't seen this cheap generalization of the transformation: it looks like a useful preparation for the main compression ...
About symbol grouping - my thought was that language evolution works rather with phonetics - so maybe it might be more effective to block groups of letters corresponding to phonemes ...
Generally, for a supercompressor I would think about some dynamically evolving DMC-like structure for local correlations, 'supervised' by a synchronously evolving NN searching for larger-scale relations ... but I agree that it's a huge amount of work - I'm focusing on physics now and I'm generally too theoretical for such .. marathons :)
Have fun!</description>
		<content:encoded><![CDATA[<p>Thanks - it was only a comment on the book - there is a lot about BWT there, but I haven&#8217;t seen this cheap generalization of the transformation: it looks like a useful preparation for the main compression &#8230;<br />
About symbol grouping - my thought was that language evolution works rather with phonetics - so maybe it might be more effective to block groups of letters corresponding to phonemes &#8230;<br />
Generally, for a supercompressor I would think about some dynamically evolving DMC-like structure for local correlations, &#8216;supervised&#8217; by a synchronously evolving NN searching for larger-scale relations &#8230; but I agree that it&#8217;s a huge amount of work - I&#8217;m focusing on physics now and I&#8217;m generally too theoretical for such .. marathons :)<br />
Have fun!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Maloeran</title>
		<link>http://www.c10n.info/archives/721#comment-350479</link>
		<dc:creator>Maloeran</dc:creator>
		<pubDate>Mon, 14 Jun 2010 03:34:26 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-350479</guid>
		<description>Jarek, reordering characters to group them by typical context has been done; for text it does usually provide a small gain over plain BWT.

I don't think making groups of symbols would be a wise thing to do given the nature of BWT. With its string ordering, it exploits very well the prediction of a byte given all previous ones.

If you want to look at something fancier and a *lot* more computationally expensive than BWT, have a look at PPM or PAQ algorithms.

Have fun!</description>
		<content:encoded><![CDATA[<p>Jarek, reordering characters to group them by typical context has been done; for text it does usually provide a small gain over plain BWT.</p>
<p>I don&#8217;t think making groups of symbols would be a wise thing to do given the nature of BWT. With its string ordering, it exploits very well the prediction of a byte given all previous ones.</p>
<p>If you want to look at something fancier and a *lot* more computationally expensive than BWT, have a look at PPM or PAQ algorithms.</p>
<p>Have fun!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jarek Duda</title>
		<link>http://www.c10n.info/archives/721#comment-350365</link>
		<dc:creator>Jarek Duda</dc:creator>
		<pubDate>Sun, 13 Jun 2010 17:11:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-350365</guid>
		<description>Hi all,
I'm just reading Matt Mahoney's book and have started appreciating BWT - my comment is that it's a much more general transformation than just using the standard (..ABCDE..) order among chars, as I've seen in the book.
If we are fighting for a few bytes like in BBWT, while focusing on text in a given language, it could help a bit to use some better BWT, optimized for this language:
- DIFFERENT SYMBOL ORDER, e.g. according to phonetics: for example separating vowels and consonants (like ..AEBCD..) - thanks to this, lexicographic order could group contexts a bit better and so the BWT could be less 'ragged'. It's cheap - just bijectively remap characters before and after BWT,
- DIFFERENT ALPHABET: maybe grouping a few symbols into a new one (like 'AE'), or splitting them (like vowel nr. 1 instead of 'A') could also make it a bit more efficient...

It would require a lot of experiments, but using a BWT optimized for a given purpose could cheaply improve the ratio by a few percent(?)
Hope it helps,
Jarek</description>
		<content:encoded><![CDATA[<p>Hi all,<br />
I&#8217;m just reading Matt Mahoney&#8217;s book and have started appreciating BWT - my comment is that it&#8217;s a much more general transformation than just using the standard (..ABCDE..) order among chars, as I&#8217;ve seen in the book.<br />
If we are fighting for a few bytes like in BBWT, while focusing on text in a given language, it could help a bit to use some better BWT, optimized for this language:<br />
- DIFFERENT SYMBOL ORDER, e.g. according to phonetics: for example separating vowels and consonants (like ..AEBCD..) - thanks to this, lexicographic order could group contexts a bit better and so the BWT could be less &#8216;ragged&#8217;. It&#8217;s cheap - just bijectively remap characters before and after BWT,<br />
- DIFFERENT ALPHABET: maybe grouping a few symbols into a new one (like &#8216;AE&#8217;), or splitting them (like vowel nr. 1 instead of &#8216;A&#8217;) could also make it a bit more efficient&#8230;</p>
<p>It would require a lot of experiments, but using a BWT optimized for a given purpose could cheaply improve the ratio by a few percent(?)<br />
Hope it helps,<br />
Jarek</p>
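<p>A minimal sketch of the cheap remapping idea above, assuming a naive rotation-sorting BWT with a saved index. The vowels-first order and the names <code>VOWELS_FIRST</code>, <code>remapped_bwt</code> and <code>remapped_unbwt</code> are illustrative assumptions only; whether any particular order actually improves the ratio would still have to be measured.</p>

```python
import string

# Hypothetical symbol order: vowels first, then consonants (an assumption).
VOWELS_FIRST = "aeiou" + "".join(
    c for c in string.ascii_lowercase if c not in "aeiou"
)
# Map each letter to the letter holding its rank in the new order, and back.
TO_RANK = str.maketrans(VOWELS_FIRST, string.ascii_lowercase)
FROM_RANK = str.maketrans(string.ascii_lowercase, VOWELS_FIRST)


def bwt(s):
    """Naive BWT of a non-empty string: sort all rotations, keep last column and index."""
    rots = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rots), rots.index(s)


def unbwt(last, idx):
    """Invert the naive BWT using the stable-sort permutation of the last column."""
    order = sorted(range(len(last)), key=lambda i: last[i])
    out, k = [], idx
    for _ in range(len(last)):
        k = order[k]
        out.append(last[k])
    return "".join(out)


def remapped_bwt(s):
    """Remap symbols, run the plain BWT, then map the output back."""
    last, idx = bwt(s.translate(TO_RANK))
    return last.translate(FROM_RANK), idx


def remapped_unbwt(last, idx):
    """Exact inverse of remapped_bwt."""
    return unbwt(last.translate(TO_RANK), idx).translate(FROM_RANK)
```

<p>Because the remap is a bijection on the alphabet, <code>remapped_unbwt(*remapped_bwt(s))</code> gives back <code>s</code>; only the sort order used inside the transform changes. Characters outside a-z are left untouched by this sketch.</p>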
]]></content:encoded>
	</item>
	<item>
		<title>By: David Scott</title>
		<link>http://www.c10n.info/archives/721#comment-324898</link>
		<dc:creator>David Scott</dc:creator>
		<pubDate>Sun, 28 Feb 2010 16:23:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-324898</guid>
		<description>Yes, it looks great, and for simple strings, when you
have an index equal to zero, your UNBWT is the same
as UNBWTS. So, since you are not using an index for the inverse on the bottom part, you could always just do the UNBWTS, since most of the time UNBWT does not exist. But when it does,
it's the same as UNBWTS.
Where you see the difference is if you BWT a string and
get a nonzero index from your code.
When you UNBWTS the resultant string you get something else. That, however, when BWTed goes to a string with an index that is not zero.
Go ahead, do the BWTS and UNBWTS.
David</description>
		<content:encoded><![CDATA[<p>Yes, it looks great, and for simple strings, when you<br />
have an index equal to zero, your UNBWT is the same<br />
as UNBWTS. So, since you are not using an index for the inverse on the bottom part, you could always just do the UNBWTS, since most of the time UNBWT does not exist. But when it does,<br />
it&#8217;s the same as UNBWTS.<br />
Where you see the difference is if you BWT a string and<br />
get a nonzero index from your code.<br />
When you UNBWTS the resultant string you get something else. That, however, when BWTed goes to a string with an index that is not zero.<br />
Go ahead, do the BWTS and UNBWTS.<br />
David</p>
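<p>For anyone who wants to try the BWTS and UNBWTS without writing it from scratch, here is a minimal, unoptimized sketch. It assumes the usual description of BWTS in terms of Lyndon factorization (Duval's algorithm) with rotations compared as infinitely repeated strings; the function names are made up for illustration and this is not the original BWTS code.</p>

```python
def lyndon_factors(s):
    """Duval's algorithm: split s into a non-increasing sequence of Lyndon words."""
    n, i, out = len(s), 0, []
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            out.append(s[i:i + j - k])
            i += j - k
    return out


def bwts(s):
    """Bijective BWT: sort the rotations of every Lyndon factor, keep last chars."""
    n = len(s)
    rots = [w[r:] + w[:r] for w in lyndon_factors(s) for r in range(len(w))]

    def key(rot):
        # Compare rotations as infinite repetitions; a prefix of length 2n
        # is enough to decide the order of any two of them.
        return (rot * (2 * n // len(rot) + 1))[:2 * n]

    rots.sort(key=key)
    return "".join(rot[-1] for rot in rots)


def unbwts(t):
    """Inverse: every cycle of the stable-sort permutation is one Lyndon factor."""
    n = len(t)
    order = sorted(range(n), key=lambda i: t[i])  # sorted() is stable
    seen = [False] * n
    words = []
    for start in range(n):
        if seen[start]:
            continue
        word, k = [], start
        while not seen[k]:
            seen[k] = True
            k = order[k]
            word.append(t[k])
        words.append("".join(word))
    # Factors are recovered smallest first, but appear largest first in the text.
    return "".join(reversed(words))
```

<p>With this sketch <code>bwts("banana")</code> gives <code>annbaa</code>, with no index to store, while the plain BWT of the same string is <code>nnbaaa</code> plus index 3. Every string is also a valid input to <code>unbwts</code>; for example <code>unbwts("hello")</code> gives <code>olleh</code>, and <code>bwts("olleh")</code> gives <code>hello</code> back.</p>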
]]></content:encoded>
	</item>
	<item>
		<title>By: Kragen Javier Sitaker</title>
		<link>http://www.c10n.info/archives/721#comment-324883</link>
		<dc:creator>Kragen Javier Sitaker</dc:creator>
		<pubDate>Sun, 28 Feb 2010 13:37:21 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-324883</guid>
		<description>Hey, for people who aren't that familiar with the BWT, if you want to play with an interactive BWT (the standard version, not Scott's bijective version), I wrote one a while back in JS at http://canonical.org/~kragen/sw/bwt.html. You can fiddle with changing the BWTed string or saved index and watch the poor algorithm struggle to invert it.

I wonder if I'll get around to implementing the Scottified version. It looks like fun!</description>
		<content:encoded><![CDATA[<p>Hey, for people who aren&#8217;t that familiar with the BWT, if you want to play with an interactive BWT (the standard version, not Scott&#8217;s bijective version), I wrote one a while back in JS at <a href="http://canonical.org/~kragen/sw/bwt.html" rel="nofollow">http://canonical.org/~kragen/sw/bwt.html</a>. You can fiddle with changing the BWTed string or saved index and watch the poor algorithm struggle to invert it.</p>
<p>I wonder if I&#8217;ll get around to implementing the Scottified version. It looks like fun!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Scott</title>
		<link>http://www.c10n.info/archives/721#comment-314376</link>
		<dc:creator>David Scott</dc:creator>
		<pubDate>Tue, 22 Dec 2009 16:14:12 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-314376</guid>
		<description>Of course it's impossible to tell why it was rejected. I am definitely not part of the old boy network. But when I see really good papers going unpublished and people being punished for not bowing to the gods of global warming, it's almost an honor not to get published. I don't know how good the paper is, but I feel confident that the really good innovative papers never see the light of day first time up in today's political stranglehold on science. Fact: it's been getting colder the last few years, yet the so-called scientists that are peer reviewed are doing their best to hide those facts.
By the way, the German guy mentions my name several times, so in a way I think I do get credit for the bijective BWT, even though most idiots think the regular BWT is bijective.

Sorry alan, I don't get it. What are you ranting about?</description>
		<content:encoded><![CDATA[<p>Of course it&#8217;s impossible to tell why it was rejected. I am definitely not part of the old boy network. But when I see really good papers going unpublished and people being punished for not bowing to the gods of global warming, it&#8217;s almost an honor not to get published. I don&#8217;t know how good the paper is, but I feel confident that the really good innovative papers never see the light of day first time up in today&#8217;s political stranglehold on science. Fact: it&#8217;s been getting colder the last few years, yet the so-called scientists that are peer reviewed are doing their best to hide those facts.<br />
By the way, the German guy mentions my name several times, so in a way I think I do get credit for the bijective BWT, even though most idiots think the regular BWT is bijective.</p>
<p>Sorry alan, I don&#8217;t get it. What are you ranting about?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Maloeran</title>
		<link>http://www.c10n.info/archives/721#comment-314359</link>
		<dc:creator>Maloeran</dc:creator>
		<pubDate>Tue, 22 Dec 2009 11:51:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-314359</guid>
		<description>David,

That's ridiculous, the paper was rejected on the basis that you can just add an EOF symbol? This is so utterly stupid...

If you really were the first to implement an index-free bijective BWT, I sincerely hope you manage to get credit for it.</description>
		<content:encoded><![CDATA[<p>David,</p>
<p>That&#8217;s ridiculous, the paper was rejected on the basis that you can just add an EOF symbol? This is so utterly stupid&#8230;</p>
<p>If you really were the first to implement an index-free bijective BWT, I sincerely hope you manage to get credit for it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: alan</title>
		<link>http://www.c10n.info/archives/721#comment-314334</link>
		<dc:creator>alan</dc:creator>
		<pubDate>Tue, 22 Dec 2009 02:37:55 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-314334</guid>
		<description>Clue everyone:

what is "sound"?</description>
		<content:encoded><![CDATA[<p>Clue everyone:</p>
<p>what is &#8220;sound&#8221;?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: alan</title>
		<link>http://www.c10n.info/archives/721#comment-314333</link>
		<dc:creator>alan</dc:creator>
		<pubDate>Tue, 22 Dec 2009 02:35:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-314333</guid>
		<description>Re: Franc Jarnovic,

do you get it ? ("honour thy error" - Brian Eno;  musician)</description>
		<content:encoded><![CDATA[<p>Re: Franc Jarnovic,</p>
<p>do you get it ? (&#8220;honour thy error&#8221; - Brian Eno;  musician)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Scott</title>
		<link>http://www.c10n.info/archives/721#comment-314299</link>
		<dc:creator>David Scott</dc:creator>
		<pubDate>Mon, 21 Dec 2009 17:42:08 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-314299</guid>
		<description>It figures just when I said I have not heard from Yossi I get an email. So maybe we will do more next year.</description>
		<content:encoded><![CDATA[<p>It figures just when I said I have not heard from Yossi I get an email. So maybe we will do more next year.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Scott</title>
		<link>http://www.c10n.info/archives/721#comment-314290</link>
		<dc:creator>David Scott</dc:creator>
		<pubDate>Mon, 21 Dec 2009 15:55:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-314290</guid>
		<description>The paper was rejected. As someone commented, you can do a BWT using a special EOF symbol so that an index is not necessary.
I don't think some of the reviewers had a clue what it really meant.
Most blocks of data, regardless of whether you allow a free EOF or an index, still don't have a reverse BWT.
The concept was not well understood by the reviewers.
During the time we started writing this, another guy
who asked to join us started writing his own paper that also
had bijective ST transforms of various orders. His paper
was accepted, but maybe those running the conferences
in the Czech Republic have a better understanding of
just what the concept was. Anyway, here is his paper

http://arxiv.org/abs/0908.0239

Have a nice day
dave

P.S. as for Gil I have not heard from him since
Oct of this year. But I emailed him earlier this
month and no reply.</description>
		<content:encoded><![CDATA[<p>The paper was rejected. As someone commented, you can do a BWT using a special EOF symbol so that an index is not necessary.<br />
I don&#8217;t think some of the reviewers had a clue what it really meant.<br />
Most blocks of data, regardless of whether you allow a free EOF or an index, still don&#8217;t have a reverse BWT.<br />
The concept was not well understood by the reviewers.<br />
During the time we started writing this, another guy<br />
who asked to join us started writing his own paper that also<br />
had bijective ST transforms of various orders. His paper<br />
was accepted, but maybe those running the conferences<br />
in the Czech Republic have a better understanding of<br />
just what the concept was. Anyway, here is his paper</p>
<p><a href="http://arxiv.org/abs/0908.0239" rel="nofollow">http://arxiv.org/abs/0908.0239</a></p>
<p>Have a nice day<br />
dave</p>
<p>P.S. as for Gil I have not heard from him since<br />
Oct of this year. But I emailed him earlier this<br />
month and no reply.</p>
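<p>A small brute-force sketch of the point about the EOF symbol, assuming a naive BWT that appends a unique EOF character (here <code>$</code>, which sorts below the letters) instead of saving an index. The names <code>bwt_eof</code> and <code>count_images</code> are illustrative. It counts how many strings containing exactly one EOF symbol are actually the BWT of some input.</p>

```python
from itertools import product


def bwt_eof(s, eof="$"):
    """Naive index-free BWT: append a unique EOF symbol, sort rotations, keep last column."""
    t = s + eof
    rots = sorted(t[i:] + t[:i] for i in range(len(t)))
    return "".join(r[-1] for r in rots)


def count_images(n, alphabet="ab", eof="$"):
    """Return (valid BWT outputs, candidate strings) for inputs of length n.

    Candidates are all strings of length n + 1 with exactly one EOF symbol.
    """
    images = {bwt_eof("".join(p), eof) for p in product(alphabet, repeat=n)}
    candidates = (n + 1) * len(alphabet) ** n
    return len(images), candidates
```

<p>For inputs of length 2 over <code>ab</code>, <code>count_images(2)</code> returns <code>(4, 12)</code>: only 4 of the 12 candidate strings are the BWT of anything, and the other 8 have no inverse. Since this transform is one-to-one, only 1 candidate in n + 1 is a valid output for length n, so the EOF trick removes the index but does not make the transform bijective.</p>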
]]></content:encoded>
	</item>
	<item>
		<title>By: Maloeran</title>
		<link>http://www.c10n.info/archives/721#comment-314220</link>
		<dc:creator>Maloeran</dc:creator>
		<pubDate>Sun, 20 Dec 2009 23:32:29 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-314220</guid>
		<description>Well done David!

Last time I played with BWT, I just thought that there had to be a way to reconstruct the original sequence without that index... Of course, it wasn't that simple, but you solved it very nicely!

I believe I'll be reading the paper and trying to squeeze a few extra bytes out of my compressor, thanks David.</description>
		<content:encoded><![CDATA[<p>Well done David!</p>
<p>Last time I played with BWT, I just thought that there had to be a way to reconstruct the original sequence without that index&#8230; Of course, it wasn&#8217;t that simple, but you solved it very nicely!</p>
<p>I believe I&#8217;ll be reading the paper and trying to squeeze a few extra bytes out of my compressor, thanks David.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David A. Scott</title>
		<link>http://www.c10n.info/archives/721#comment-296081</link>
		<dc:creator>David A. Scott</dc:creator>
		<pubDate>Wed, 08 Jul 2009 04:56:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.c10n.info/archives/721#comment-296081</guid>
		<description>Here is a draft of the paper on the bijective form of BWT

http://bijective.dogma.net/00yyy.pdf</description>
		<content:encoded><![CDATA[<p>Here is a draft of the paper on the bijective form of BWT</p>
<p><a href="http://bijective.dogma.net/00yyy.pdf" rel="nofollow">http://bijective.dogma.net/00yyy.pdf</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>
