Friday, July 8, 2011

SQL joins graphically simplified...

SQL[structured query language] is the language to get data from a DB. It let you retrieve data from a simple "select *" to many customized ways. "JOIN" is the keyword to retrieve data from multiple tables with a given relationship "ON".

There are few more keywords used with JOIN, and some are optional. ex: INNER, OUTER,LEFT,RIGHT,FULL,CROSS,... those keywords depends on the DB you are using.

As given early JOIN give us the opportunity to retrieve rows which are related on two tables. So those could be able to show in a Cartesian Venn diagram. So i have created two sample tables and added sample data to those. lets see how this can be simulated.

First lets create sample tables:


CREATE TABLE tblleft ( id_customer int(10), customer_name varchar(10) ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
insert into tblleft (id_customer, customer_name) values (1, 'amara');
insert into tblleft (id_customer, customer_name) values (2, 'dasun');
insert into tblleft (id_customer, customer_name) values (3, 'saman');
insert into tblleft (id_customer, customer_name) values (4, 'nimal');
insert into tblleft (id_customer, customer_name) values (5, 'kasun');


CREATE TABLE tblright ( id_order int NOT NULL AUTO_INCREMENT, id_customer int, equipment varchar(10) DEFAULT '_', PRIMARY KEY (id_order) ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
insert into tblright (id_order, id_customer, equipment) values (1, 1, 'food');
insert into tblright (id_order, id_customer, equipment) values (2, 2, 'car');
insert into tblright (id_order, id_customer, equipment) values (3, 6, 'computer');
insert into tblright (id_order, id_customer, equipment) values (4, 7, 'fan');
insert into tblright (id_order, id_customer, equipment) values (5, 3, 'mobile');


1:
First lets do the simple join and retrieve data which are common with id_customer in the two tables 


SELECT
    *
FROM
    tblleft l
JOIN tblright r
ON
    l.id_customer=r.id_customer;

hers is the results:

2:

SELECT
    *
FROM
    tblleft l
LEFT OUTER JOIN tblright r
ON
    l.id_customer=r.id_customer;


3:
SELECT
    *
FROM
    tblleft l
RIGHT OUTER JOIN tblright r
ON
    l.id_customer=r.id_customer;



4:

SELECT
    *
FROM
    tblleft l
LEFT OUTER JOIN tblright r
ON
    l.id_customer=r.id_customer
WHERE
    r.id_order IS NULL;



5:
SELECT
    *
FROM
    tblleft l
RIGHT OUTER JOIN tblright r
ON
    l.id_customer=r.id_customer
WHERE
    l.id_customer IS NULL;



6:

SELECT
    *
FROM
    tblleft l
LEFT OUTER JOIN tblright r
ON
    l.id_customer=r.id_customer
UNION
SELECT
    *
FROM
    tblleft l
RIGHT OUTER JOIN tblright r
ON

    l.id_customer=r.id_customer;


7:
SELECT
    *
FROM
    tblleft l
LEFT OUTER JOIN tblright r
ON
    l.id_customer=r.id_customer
WHERE
    r.id_order IS NULL
UNION
SELECT
    *
FROM
    tblleft l
RIGHT OUTER JOIN tblright r
ON
    l.id_customer=r.id_customer
WHERE
    l.id_customer IS NULL;




8:Cross join will provide the Cartesian product , but it cant be drawn by a Venn diagram.

SELECT
    *
FROM
    tblleft l
JOIN tblright r;

or 

SELECT
    *
FROM
    tblleft l ,
    tblright r;





I hope you get some idea on how those result sets are described with Venn diagrams.
Thank you!

Wednesday, July 6, 2011

Prevent XSS and request forgery and other common attack patterns | [: GaB :]

It looks like two-third of the attacks are based on three vectors:
1. SQL injection (25%)
$id="1;DROP TABLE users"; mysql_query("SELECT * FROM bars WHERE id=".$id); It is deeply shocking how many "developers" still don't get the message not to execute SQL commands forged from user input. Or at least, why are they still employed? This attack would be the most simple to prevent. You just always have to *escape* strings which are parameters of the sql query coming in as request parameters. If in doubt, what do I mean by that, simply escape *all* parameters of a query. Or better use queries parameterized as "SELECT ... WHERE id=? AND type=?"-s, your language must have a way to pass the values safely afterwards.
But stop, why in the world are these guys still writing any SQL queries in the first place?! Because script kiddies don't know what a persistence framework is. Stop handcrafting CRUDDAOs.
Rule of thumb: always check. If you use any output mechanism or view, check whether it escapes or not. If not, it is *your job* to do so! (Error messages printing also the invalid value are the simplest to overlook.)
escapes by default.
* ${id} never escape by default. This is one of the most pithiest design decision in JSTL/EL. It is insecure by default. But of course you have the option to turn on security. Thank you very much. I understand the need for backward compatibility, but this is the kind of design problems which can be solved. Why not add an option to change the default behavior, and add fn:unEscapeXml() ?
<%@ taglib prefix="fn" uri="http://java.sun.com/jsp/jstl/functions" %> ${fn:escapeXml(id)} 
Plain EL expressions to produce output is pure evil.
* The output tags of your framework of choice is the most important to test. Never assume you already know it. In Spring 2.0 escaping in the form:input and form:errors tags were optional and off by default. In 2.5, it was switched to on by default in the documentation. But form:errors tag continued to print out unescaped output. In 3.0 everything seems to be fine by default in both.
Just to improve your chances always add this to your root context:
     defaultHtmlEscape     true  
3. Authentication and authorization (14%)
I guess mostly this happens by plain simply ignoring security logic. Most likely to happen when refactoring and ignoring to adopt security logic to the new business logic or domain structure. Those three should be independent by design. AOP or a security framework is a good candidate for separating the security concerns.
If you are not in control of the server side issues, use discovery tools like PhpSecInfo.
++1 Cross site request forgery a.k.a. CSRF and XSRF
Only 2% of the identified attacks fall into this category. But I also saw a lot of plain simply wrong attempts to fix this issue. This also shows how non-trivial it is for the average programmer to safeguard against this kind of attacks. It feels like many sites are vulnerable but not exploited yet.
To check the ingenuity of requests, first one must ensure, that GET requests are never changing state of the domain. (Trivial but not for everyone.)
Second, to always check POST requests' referer header values: they must match yours. This is sufficient for protection against XSRF, but has limitations.
Mostly suggested are tokens, a hidden field in every POST with a random variable identifying genuine requests (e.g. hash of the SESSIONID). The problem is, that every browser has a way to hack around it. You have to generate a new random token for each request to be used to fully safeguard yourself.
Alternatively you can change URLs dynamically (the token becomes part of the URL): for example you can use Spring's @PathVariable annotation bound to a @RequestParam("/action/{nextToken}").