


Re: UnknownColumnFamilyException after removing all Cassandra data


The node will not bootstrap if it is listed as a seed node.

-- Jacob Shadix 

On Tue, Feb 7, 2017 at 12:16 PM, Simone Franzini <captainfranz@xxxxxxxxx> wrote:
To add to my previous answer: the node in question is a seed node, so it did not bootstrap.
Should I remove it from the list of seed nodes and then try to restart it?
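
Before editing the seed list, it may help to confirm that the node's address really is listed there. The following is a minimal sketch, assuming the seeds are configured on a single `- seeds: "ip1,ip2"` line in cassandra.yaml (the layout used by the default SimpleSeedProvider configuration). The helper name is_seed is hypothetical and not part of Cassandra.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cassandra): succeeds if the given IP
# appears in the seeds list of a cassandra.yaml file.
# Assumes the seeds are on a single `- seeds: "ip1,ip2"` line.
is_seed() {
    yaml_file=$1
    node_ip=$2
    # Take the seeds line, strip everything up to the value, drop quotes
    # and whitespace, split on commas, then require an exact match.
    grep -E '^[[:space:]]*-?[[:space:]]*seeds:' "$yaml_file" \
        | sed -e 's/.*seeds:[[:space:]]*//' -e 's/"//g' -e "s/'//g" -e 's/[[:space:]]//g' \
        | tr ',' '\n' \
        | grep -Fxq "$node_ip"
}
```

For example, `is_seed /etc/cassandra/cassandra.yaml 10.0.0.2 && echo "listed as a seed"`, where both the path and the address are placeholders for your own installation.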


On Tue, Feb 7, 2017 at 9:43 AM, Simone Franzini <captainfranz@xxxxxxxxx> wrote:
This is exactly what I did on the second node. If this is not the correct / best procedure to adopt in these cases, please advise:

1. Removed all the data, including the system table (rm -rf data/ commitlog/ saved_caches).
2. Configured the node to replace itself, by adding the following line to cassandra-env.sh: JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<node own IP address>"
3. Started the node.
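
Step 2 above can be sketched as a small shell helper. This is only an illustration under assumptions: the function name add_replace_address is hypothetical, and the path to cassandra-env.sh and the IP address are whatever applies to your installation. It appends the same line quoted in step 2 and refuses to add a second one if a replace_address option is already present.

```shell
#!/bin/sh
# Hypothetical helper: append the replace_address option from step 2 to a
# cassandra-env.sh file, unless a replace_address option is already there.
add_replace_address() {
    env_file=$1
    node_ip=$2
    if grep -q 'cassandra\.replace_address=' "$env_file"; then
        echo "replace_address already set in $env_file" >&2
        return 1
    fi
    # The single-quoted format keeps $JVM_OPTS literal, so it is expanded
    # when the file is sourced at startup, not when this helper runs.
    printf 'JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=%s"\n' "$node_ip" >> "$env_file"
}
```

Since the option is only meant for the replacement start, a helper like this makes it easier to see later whether the line is still in the file.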

Notably, I did not run nodetool decommission or removenode. Is that the recommended approach?

Given what I did, I am mystified as to what the problem is. If I query system.schema_columnfamilies on the affected node, all CF IDs are there. The same goes for the only other node that is currently up. That node also has data for all those CF IDs in its data folder.
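
One way to check the data folder for a specific CF ID is sketched below. It assumes the on-disk layout in which each table directory is named `<table>-<CF ID without hyphens>` under `data/<keyspace>/`. If your version lays out directories differently, this will not find anything. The helper name find_cf_dirs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list table directories whose name ends in the given
# CF ID. Assumes directories are named <table>-<id without hyphens> and sit
# two levels below the data directory (data/<keyspace>/<table>-<id>).
find_cf_dirs() {
    data_dir=$1
    cf_id=$2
    # Directory names carry the ID without its hyphens.
    suffix=$(printf '%s' "$cf_id" | tr -d '-')
    find "$data_dir" -mindepth 2 -maxdepth 2 -type d -name "*-$suffix"
}
```

For example, `find_cf_dirs data/ e36415b6-95a7-368c-9ac0-ae0ac774863d` would look for the ID reported in the error. Empty output would suggest that no table directory for that ID exists under the given data folder.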



On Tue, Feb 7, 2017 at 5:39 AM, kurt greaves <kurt@xxxxxxxxxxxxxxx> wrote:
The node is trying to communicate with another node, potentially streaming data, and is receiving files/data for an "unknown column family". That is, it doesn't know about the CF with the id e36415b6-95a7-368c-9ac0-ae0ac774863d.
If you deleted some columnfamilies but not all of the system keyspace and then restarted the node, I'd expect this error to occur. The same could happen if you didn't decommission the node properly before blowing the data away and restarting.

You'll have to give us more information on what your exact steps were on this 2nd node:

When you say you deleted all Cassandra data, did this include the system tables? Were your steps to delete all the data and then just restart the node? Did you remove the node from the cluster prior to deleting the data and restarting it (nodetool decommission/removenode)? Did the node rejoin the cluster or did it have to bootstrap?