osdir.com
mailing list archive

Subject: Re: scanner.next returns null - msg#00061

List: hbase-user-hadoop-apache


Hi JG,

String.valueOf(int).getBytes() worked with 0.19.1. It seems that
Bytes.toBytes(int) works properly with 0.20/TRUNK.

Thanks a lot,
Ramesh



Jonathan Gray-2 wrote:
>
> A scanner will return null if there are no rows to be returned. Are you
> certain that you're setting this up correctly?
>
> Bytes.toBytes(int) yields the binary representation of the integer (4
> byte, big endian).
>
> String.valueOf(int) gives the ASCII representation of the integer (after
> .getBytes() you're getting 7 characters yielding 7 bytes).
>
> Both can be correct, but it can only be one or the other.
>
> I recommend the first because it gives you numerically sorted order. The
> second method sorts the digits as text, which gives strange ordering for
> numbers of different lengths (e.g. "10" sorts before "9"), so you would
> have to zero-pad the numbers to get numerical ordering.
>
> Hope that makes sense.
>
> JG
>
> On Wed, July 1, 2009 12:10 am, Ramesh.Ramasamy wrote:
>>
>> Hi,
>>
>>
>> I'm trying to use one of the RowFilterInterface implementations, for
>> example PrefixRowFilter, in my code.
>>
>> ('snpTbl' is the HTable reference)
>>
>>
>> int id = 3136824;
>> PrefixRowFilter prf = new PrefixRowFilter(Bytes.toBytes(id+1));
>>
>> Scanner scanner = snpTbl.getScanner(Bytes.toByteArrays("info:"),
>>     String.valueOf(id).getBytes(), prf);
>>
>> RowResult snpRr = scanner.next();
>>
>>
>> I'm always getting 'snpRr' as null. Could someone point out what I am
>> doing wrong?
>>
>> TIA,
>> Ramesh
>> --
>> View this message in context:
>> http://www.nabble.com/scanner.next-returns-null-tp24285200p24285200.html
>> Sent from the HBase User mailing list archive at Nabble.com.

--
View this message in context:
http://www.nabble.com/scanner.next-returns-null-tp24285200p24354701.html
Sent from the HBase User mailing list archive at Nabble.com.
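
The encoding mismatch discussed in this thread can be sketched in plain Java, with no HBase dependency. This is only an illustration under stated assumptions: the class and method names (RowKeyEncodings, binaryKey, asciiKey, paddedAsciiKey, startsWith, compare) are made up for this example, binaryKey assumes Bytes.toBytes(int) behaves as JG describes (4 bytes, big endian), and compare assumes row keys are ordered byte by byte as unsigned values. It only considers non-negative ids.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of HBase.
final class RowKeyEncodings {

    // 4-byte big-endian form, as the thread describes Bytes.toBytes(int).
    static byte[] binaryKey(int id) {
        return ByteBuffer.allocate(4).putInt(id).array();
    }

    // ASCII decimal digits, as String.valueOf(int).getBytes() yields.
    static byte[] asciiKey(int id) {
        return String.valueOf(id).getBytes(StandardCharsets.US_ASCII);
    }

    // ASCII decimal digits, zero-padded on the left to a fixed width.
    static byte[] paddedAsciiKey(int id, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", id)
                .getBytes(StandardCharsets.US_ASCII);
    }

    // True when 'key' begins with every byte of 'prefix'.
    static boolean startsWith(byte[] key, byte[] prefix) {
        if (prefix.length > key.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (key[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Unsigned, byte-by-byte lexicographic comparison of two keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        int id = 3136824;
        System.out.println("binary key length: " + binaryKey(id).length);
        System.out.println("ASCII key length:  " + asciiKey(id).length);
        System.out.println("ASCII row matches binary prefix: "
                + startsWith(asciiKey(id), binaryKey(id + 1)));
        System.out.println("ASCII \"9\" sorts after \"10\": "
                + (compare(asciiKey(9), asciiKey(10)) > 0));
        System.out.println("padded 9 sorts before padded 10: "
                + (compare(paddedAsciiKey(9, 7), paddedAsciiKey(10, 7)) < 0));
    }
}
```

The two forms of the same id share no prefix, so a filter built from one form cannot match rows keyed in the other. Whichever form is chosen, it has to be used for the row keys, the start row and the filter prefix alike.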


