c*e 发帖数: 17 | 1 I am new to cassandra and want to know how to design a security db.
In RDBMS, it's modeled like belows:
1. membership table uid, username, password,email,etc.
2. user table uid,firstname,lastname, phone,address,etc.
3. password history table pid,uid,passwordhashed,etc
4. roles rid, rolename, description
5. Permission peid,permissionname,description
6. role-permission-mapping rid,peid
7. user-role uid,rid4
some basic operations:
1. query if user has role
2. query if user has permission
3. CRUD permissions in role
4. CRUD user roles
Thanks. | w**z 发帖数: 8232 | 2 doesn't look like a good use case for C*
【在 c*e 的大作中提到】 : I am new to cassandra and want to know how to design a security db. : In RDBMS, it's modeled like belows: : 1. membership table uid, username, password,email,etc. : 2. user table uid,firstname,lastname, phone,address,etc. : 3. password history table pid,uid,passwordhashed,etc : 4. roles rid, rolename, description : 5. Permission peid,permissionname,description : 6. role-permission-mapping rid,peid : 7. user-role uid,rid4 : some basic operations:
| g*****g 发帖数: 34805 | 3 This is what I would do.
Permission CF, this may or may not be needed dependent on the usage.
Role CF with a list of permissions in json format
User CF with a list of roles and all other user field
3 CF total. If you need reverse look up, add index CF.
【在 c*e 的大作中提到】 : I am new to cassandra and want to know how to design a security db. : In RDBMS, it's modeled like belows: : 1. membership table uid, username, password,email,etc. : 2. user table uid,firstname,lastname, phone,address,etc. : 3. password history table pid,uid,passwordhashed,etc : 4. roles rid, rolename, description : 5. Permission peid,permissionname,description : 6. role-permission-mapping rid,peid : 7. user-role uid,rid4 : some basic operations:
| c*e 发帖数: 17 | 4 Thanks! I definitely need the Permission CF since I need to maintain the
list of the permissions.
Couple more questions regarding your design:
1. To check if user has some permissions:
first I need to select roles from users where rid=;
then loop through each roles returned, select permission from roles where
rid= and in permissionlist?
2. I do need a lot of reverse look up, for example, retrieve user by email,
certificate,username,etc. For each of those index CF, is the best practice
to store all the needed user attributes(basically all columns in user CF) in
it or just store the uid and then do two queries?
If store all information in the index CF again, isn't that a headache when
we need to update some user information?(if another programmer add some
other index CF, it's very easy to create update abnormally)
【在 g*****g 的大作中提到】 : This is what I would do. : Permission CF, this may or may not be needed dependent on the usage. : Role CF with a list of permissions in json format : User CF with a list of roles and all other user field : 3 CF total. If you need reverse look up, add index CF.
| w**z 发帖数: 8232 | 5 for C*, you better denormalize your data, then update is pain. C* doesn't
support join. And secondary index has performance issue. If you maintain too
many index CF yourself, it is going to be messy. Why do you want to use C*
for your usecase?
,
in
【在 c*e 的大作中提到】 : Thanks! I definitely need the Permission CF since I need to maintain the : list of the permissions. : Couple more questions regarding your design: : 1. To check if user has some permissions: : first I need to select roles from users where rid=; : then loop through each roles returned, select permission from roles where : rid= and in permissionlist? : 2. I do need a lot of reverse look up, for example, retrieve user by email, : certificate,username,etc. For each of those index CF, is the best practice : to store all the needed user attributes(basically all columns in user CF) in
| g*****g 发帖数: 34805 | 6 1. yes and no. A user has a list of roles. A role has a list of permissions.
I would model the former as a Json List in a single column, and the latter
in wide columns. 2 queries are fine, typically roles/permissions are loaded
in the session when user login and you don't need to redo it during the
entire session.
2. You can store the id only. If you see performance issue then you want to
denormalize and yes, update is a pain.
,
in
【在 c*e 的大作中提到】 : Thanks! I definitely need the Permission CF since I need to maintain the : list of the permissions. : Couple more questions regarding your design: : 1. To check if user has some permissions: : first I need to select roles from users where rid=; : then loop through each roles returned, select permission from roles where : rid= and in permissionlist? : 2. I do need a lot of reverse look up, for example, retrieve user by email, : certificate,username,etc. For each of those index CF, is the best practice : to store all the needed user attributes(basically all columns in user CF) in
| g*****g 发帖数: 34805 | 7 It depends on the use case. An app with 10M users high traffic and no
transaction support required, it may not be a bad idea. update is painful
but still doable. Scalability issue is solved upfront.
too
*
【在 w**z 的大作中提到】 : for C*, you better denormalize your data, then update is pain. C* doesn't : support join. And secondary index has performance issue. If you maintain too : many index CF yourself, it is going to be messy. Why do you want to use C* : for your usecase? : : , : in
| c*e 发帖数: 17 | 8
permissions.
latter
loaded
Can you elaborate a little bit why only 2 queries are needed? Don't you
still need to loop through the role to query the permissions?
to
【在 g*****g 的大作中提到】 : 1. yes and no. A user has a list of roles. A role has a list of permissions. : I would model the former as a Json List in a single column, and the latter : in wide columns. 2 queries are fine, typically roles/permissions are loaded : in the session when user login and you don't need to redo it during the : entire session. : 2. You can store the id only. If you see performance issue then you want to : denormalize and yes, update is a pain. : : , : in
| c*e 发帖数: 17 | 9 Hi, I don't have to but as you see I am learning and trying to figure out
why
before I make the decision. Thanks.
too
*
【在 w**z 的大作中提到】 : for C*, you better denormalize your data, then update is pain. C* doesn't : support join. And secondary index has performance issue. If you maintain too : many index CF yourself, it is going to be messy. Why do you want to use C* : for your usecase? : : , : in
| p*****2 发帖数: 21240 | 10
怎么不考虑一下mongo?
【在 c*e 的大作中提到】 : Hi, I don't have to but as you see I am learning and trying to figure out : why : before I make the decision. Thanks. : : too : *
| | | c*e 发帖数: 17 | 11 I haven't touched mongodb at all. What makes mongo different from Cassandra
in
my scenario? Can you explain how mongo can help to achieve my goal? Thanks.
【在 p*****2 的大作中提到】 : : 怎么不考虑一下mongo?
| p*****2 发帖数: 21240 | 12
Cassandra
Thanks.
先说说你为什么不用SQL,用Cassandra吧。
【在 c*e 的大作中提到】 : I haven't touched mongodb at all. What makes mongo different from Cassandra : in : my scenario? Can you explain how mongo can help to achieve my goal? Thanks.
| w**z 发帖数: 8232 | 13 you can do multiselect in C*, like rid in(...). C* tries to optimize the
multiselect, but it still issues multiple queris in parallel.
【在 c*e 的大作中提到】 : I haven't touched mongodb at all. What makes mongo different from Cassandra : in : my scenario? Can you explain how mongo can help to achieve my goal? Thanks.
| w**z 发帖数: 8232 | 14 just curious does Nflx store users in MySQL?
【在 g*****g 的大作中提到】 : It depends on the use case. An app with 10M users high traffic and no : transaction support required, it may not be a bad idea. update is painful : but still doable. Scalability issue is solved upfront. : : too : *
| w**z 发帖数: 8232 | 15 how many users are we talking about here? what is the load?
【在 c*e 的大作中提到】 : Hi, I don't have to but as you see I am learning and trying to figure out : why : before I make the decision. Thanks. : : too : *
| g*****g 发帖数: 34805 | 16 Depends what part of data you are talking about. Payment related is in MySQL
, most other data is in C*.
【在 w**z 的大作中提到】 : just curious does Nflx store users in MySQL?
| p*****2 发帖数: 21240 | 17
MySQL
ACL
【在 g*****g 的大作中提到】 : Depends what part of data you are talking about. Payment related is in MySQL : , most other data is in C*.
| g*****g 发帖数: 34805 | 18 There's no ACL in netflix user facing websites, as there's only one type of
user. Admin sites are in other apps behind firewall.
【在 p*****2 的大作中提到】 : : MySQL : ACL
| w**z 发帖数: 8232 | 19 So user profile info like name, age, gender etc. is stored in C*?
MySQL
【在 g*****g 的大作中提到】 : Depends what part of data you are talking about. Payment related is in MySQL : , most other data is in C*.
| g*****g 发帖数: 34805 | 20 I can't say for sure but probably.
【在 w**z 的大作中提到】 : So user profile info like name, age, gender etc. is stored in C*? : : MySQL
| | | p*****2 发帖数: 21240 | 21
probably ok. but I don't think eventual consistency is right choice for ACL.
【在 g*****g 的大作中提到】 : I can't say for sure but probably.
| g*****g 发帖数: 34805 | 22 Cassandra is tunable consistency. You can achieve atomicity on row level,
for most non-financial usage, it's probably workable. Cassandra is the
preferred datastore for all user data because we are doing active-active in
multiple region. And Cassandra supports multiple DC out of box.
http://techblog.netflix.com/2013/12/active-active-for-multi-reg
ACL.
【在 p*****2 的大作中提到】 : : probably ok. but I don't think eventual consistency is right choice for ACL.
| p*****2 发帖数: 21240 | 23
in
Sounds good. Why don't you recommend SQL for this use case?
【在 g*****g 的大作中提到】 : Cassandra is tunable consistency. You can achieve atomicity on row level, : for most non-financial usage, it's probably workable. Cassandra is the : preferred datastore for all user data because we are doing active-active in : multiple region. And Cassandra supports multiple DC out of box. : http://techblog.netflix.com/2013/12/active-active-for-multi-reg : : ACL.
| g*****g 发帖数: 34805 | 24 You mean OP's use case or Netflix's use case?
【在 p*****2 的大作中提到】 : : in : Sounds good. Why don't you recommend SQL for this use case?
| p*****2 发帖数: 21240 | 25
LZ's
【在 g*****g 的大作中提到】 : You mean OP's use case or Netflix's use case?
| g*****g 发帖数: 34805 | 26 No, I always recommend SQL until SQL is insufficient. I said if OP has 10M
users, it may not be a bad idea to use Cassandra, that's a big if.
【在 p*****2 的大作中提到】 : : LZ's
| p*****2 发帖数: 21240 | 27
I see. Do you think they need a cache layer if they need to check ACL for
every user operations?
【在 g*****g 的大作中提到】 : No, I always recommend SQL until SQL is insufficient. I said if OP has 10M : users, it may not be a bad idea to use Cassandra, that's a big if.
| g*****g 发帖数: 34805 | 28 Sure, user roles typically gets loaded in session and stay there for entire
session. You could also use memcache or redis to cache user session.
【在 p*****2 的大作中提到】 : : I see. Do you think they need a cache layer if they need to check ACL for : every user operations?
|
|