Little countries, big domains

Browsing through .al domains, I realized that they are being registered by almost every big corporation. Google (.com.al), Microsoft (.com.al), Facebook, Twitter, eBay and a bunch of other companies own at least one .al domain. Some of them purchased the domains directly (host.al has the biggest market share when it comes to .al registration), and others have chosen to register through domain protection companies such as MarkMonitor. While this might be part of a strategy to protect brands in every country, there are also some domains that qualify as domain hacks, which are quite a trend in the .al zone.

Matt Mullenweg of WordPress has registered nav.al and extern.al for example.
Some more projects based on .al domains:

Tid.al is a platform that identifies the best content from tens of thousands of the best online contributors and connects them to publishers or brands. The company powers fashion look-books for Teen Vogue, Zagat and Neutrogena. For Conde Nast, publisher of Teen Vogue, the platform produces ten times the editorial-quality content for 1/5th the cost of a staff editor. The platform was developed by founder Matthew Myers (as described in TechCrunch).

Optim.al is the leading multivariate ad platform for Facebook and is the creation of the technology team at Optimal, Inc., based in San Francisco. Optimal, Inc. is the first advertising technology company to be both an approved Facebook Ads API Tools vendor and have built its own robust real-time bidding and audience infrastructure, integrated with all major ad inventory sources. (Cool Hack: Be.Optim.Al)

Surre.al provides free to play augmented reality 3D gaming content to the consumer. Surre.al’s location enabled rich media platform allows brands and advertisers to engage the consumer with seamlessly integrated interactive advertising. Think Cap’n Crunch leaping off a cereal box into your iPhone to play with your kid.

A whole bunch of other domain hacks are registered in the .al zone (ide.al, sport.al, dailyde.al, port.al, livingsoci.al, etc.), but they are not active yet. If you know a startup or an interesting company with a .al domain, let me know: I will update this list.
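For readers new to the term, a domain hack simply splits a word so that its ending becomes the TLD. A minimal sketch of that idea follows; the function name domainHack() is hypothetical and only illustrates the string split, not any registry rule.

```php
<?php
// Hypothetical helper: turn a word into a "domain hack" for a given TLD,
// or return null when the word does not end with that TLD.
function domainHack(string $word, string $tld): ?string
{
    $word = strtolower($word);
    $tld  = strtolower(ltrim($tld, '.'));
    $len  = strlen($tld);

    // The word must end with the TLD and leave at least one character for the label.
    if (strlen($word) <= $len || substr($word, -$len) !== $tld) {
        return null;
    }

    return substr($word, 0, -$len) . '.' . $tld;
}

echo domainHack('optimal', 'al'), "\n";
echo domainHack('surreal', '.al'), "\n";
```

Running a dictionary word list through such a function is one way to look for candidates; whether a given name is still available is a separate question for the registrar.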

Update 1: A few .al domains are used as URL shorteners: I was informed of Visu.al, Payp.al and Get.al. Thanks to Olgi for drawing attention to this category.

Update 2: Equ.al is also registered and is no longer for sale. It can create great hacks such as http://we.are.equ.al

Update 3: Sport.al was sold for $7,500 in June 2012.

SPARQL / RDF and PHP

While it looks as if all the development tools dealing with RDF and SPARQL are pretty much Java-oriented, there are also some nice libraries and tools that can help any ‘PHP lover’ easily test and start working with simple RDF systems and query them.

I found the article “Dead Simple: RDF and SPARQL using PHP”, which uses librdf to implement a simple store and query it. In addition, the ARC project has a nice wiki and some tutorials on how to implement a simple store, a SPARQL endpoint and many other aspects.

OpenDNS vs. Google DNS vs. Norton DNS

There are a few posts around the internet pushing users to change their DNS servers for a faster internet experience. Although this might be true in very (very) few cases, I believe the DNS servers closest to you are the fastest. That means your local ISP's DNS servers. ISPs today serve millions of DNS requests every day. The fastest solution for DNS providers like Google would be to provide local DNS servers to your ISP (mainly through Akamai), but this is not always the case, especially if you are located outside the US.

Nonetheless, there are some cases when you still want to change the DNS server. For me personally, it was the number of advertising pages I was getting whenever a domain could not be resolved. My local ISP was practically making use of those failed requests: instead of a “server not found” error, it was pushing me toward a Google Ads page, which might help beginners but upsets more skilled internet users.
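One rough way to check whether the resolver you are using rewrites failed lookups is to ask it for a name that should never exist and see whether an IP address comes back anyway. The sketch below assumes a resolver callable with the semantics of PHP's gethostbyname() (it returns the unmodified host name on failure); the function name hijacksNxdomain() is hypothetical.

```php
<?php
/**
 * Hypothetical helper: returns true if the resolver answers with an IP address
 * for a host name that should not exist.
 *
 * $resolve is any callable behaving like gethostbyname(): it returns an IPv4
 * address on success and the unmodified host name on failure.
 */
function hijacksNxdomain(callable $resolve, string $bogusHost): bool
{
    $answer = $resolve($bogusHost);

    // An unchanged name means the lookup failed, which is the honest outcome.
    if ($answer === $bogusHost) {
        return false;
    }

    // Any valid IP address for a non-existent name suggests a redirect.
    return filter_var($answer, FILTER_VALIDATE_IP) !== false;
}

// Build a random probe name under the reserved ".invalid" TLD.
$probe = 'nxdomain-check-' . bin2hex(random_bytes(4)) . '.invalid';
echo $probe, "\n";
```

Calling hijacksNxdomain('gethostbyname', $probe) would then test whatever DNS servers the machine is currently configured with. Some redirecting resolvers may treat reserved TLDs specially, so a false result is not proof of a clean resolver.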

A few years ago I decided to use OpenDNS, and just as I was getting satisfactory results, I started noticing that the service was still not transparent. If a domain could not be found, the service still redirected me to a page with search results and advertising.

Then I decided to switch to Google DNS. Google DNS served results as fast as OpenDNS. Although Google claims that this service is better because it has no ads or redirection, recent policy changes at Google and its being the largest advertising company in the world make people uncomfortable. Knowing that “big brother” is watching them not only while they search, but also while they just browse the internet, is an uncomfortable feeling.

A few weeks ago I ran into the Norton DNS service. Norton DNS is still in beta, but it provides a great free service which not only resolves domains, but also blocks domains that contain malicious or unwanted content such as pornography. The same services are also offered by OpenDNS in its premium packages, but Norton offers them for free (at least for now).

Norton offers DNS service depending on the desired protection policy:

A – Security (malware, phishing sites, scam sites and web proxies)
Basic internet protection
198.153.192.40
198.153.194.40
B – Security + Pornography
Protection + filtering of Pornographic content (you know, all those unwanted sites and whatever undesired pictures pop-up to married guys)
198.153.192.50
198.153.194.50
C – Security + Pornography + Non-Family Friendly
Basically paranoid filtering for Grandma
198.153.192.60
198.153.194.60

Google DNS Servers:
8.8.8.8
8.8.4.4

Open DNS Servers:
208.67.222.222
208.67.220.220

I would recommend Norton DNS to anyone for the moment, although it is a beta service. Some restrictions also apply for those who use a Juniper VPN (you will not be able to log in to the VPN if you use Norton's DNS), but these are all trivial problems.

Crazy sales on "Social" domains

There are still those who think that domains are overvalued. And yet here is another record-setting domain sale: Social.org was sold for $228,600 on NameJet. This may be considered a big sale, but it is still less than 10% of the price paid for social.com at auction a year ago: Social.com was sold for $2,600,000 in a SnapNames auction. Besides these sales, a domain hack, soci.al, was sold for $50,000 just a year ago.

It is reported that a total of 415 bidders signed up for the heavily promoted social.org auction. This suggests that interest in current Internet trends will always be high. If you have a feeling for what the content of the next internet will be about, then this is the moment to get a good domain.

Implementing a Canonical URL with Zend Framework

Motivation

Unique - Courtesy Irina Souiki

Google and the other search engines are trying to convince webmasters to use so-called canonical URLs. Canonical URLs help the search engines recognize duplicated content that comes from different calls on the same domain. For example, three different URLs can all lead to the same information, the entry point to this portal. Although a person does not care what he typed as long as he gets the information he expects, a search engine will get somewhat confused: it receives the identical result, from the same domain, under 3 different URLs. Which one should the search engine index?!

Canonical URL Explained

The canonical URL is just a simple “link tag” added to the head of your page. This link gives the owner of the page the power to tell search engines which URL is the preferred one for the page.

<link rel="canonical" href="http://www.avhumboldt.net/index.php" />

The above example is borrowed from the official Google Webmaster Central Blog, and more information on the canonical URL can be found in the article “Specify your canonical”. Although the Google blog advertising this article does not itself include a canonical link (funny, huh), it is good practice to have this feature on your website.

Implementation

If your project runs on Zend Framework, it is very easy to create a canonical link by retrieving the controller, action and parameters used in the current URL. (And if you don’t have experience with ZF, the rest of the post will look like Chinese to you.) If we take a closer look at canonical URLs, we realize that we have to decide which domain to use for the content. The rest of the URL (its second part) is the same as what is shown on the page; in some cases we just need to remove some arguments (like color/red/). In other words, if my URL is

http://www.avhumboldt.net/humboldt/publications/books/did/25/title/Aspects-of-Nature

or

http://avhumboldt.net/humboldt/publications/books/did/25/title/Aspects-of-Nature

I have to decide via canonicals which one is my domain of choice and tell the search engines to use that one (the rest is handled by the search engine; we don’t care anymore). So all we need to do is insert a

<link rel="canonical" href="http://www.avhumboldt.net/humboldt/publications/books/did/25/title/Aspects-of-Nature" />

in the head of the page. If we look at the href in the canonical link above, it can be divided into 2 parts: the domain name (together with the subdirectory where I have placed my project) and the parameters which decide the content.

  1. Domain Name (+ subdirectory): www.avhumboldt.net/humboldt
  2. Parameters: /publications/books/did/25/title/Aspects-of-Nature

Since my parameters are always the same, for canonical URLs we only need to determine the domain name we prefer and place the parameters after it. For those familiar with the Zend Framework MVC, the URL path is composed of /controller/action/parameters+. More information can be found in the Zend Framework documentation. A quick solution would be to use:

"http://www.avhumboldt.net" . $_SERVER["REQUEST_URI"];

as the canonical link. $_SERVER["REQUEST_URI"] returns the URI which was given in order to access the page (it already starts with a slash). Although this looks easy, it is not recommended to use it “as is”: the value comes straight from the client, so echoing it unfiltered can lead to security issues with your website. (There are a lot of posts and resources out there about parameter security.) A better approach is to rebuild the canonical URL from the parameters used in the URL. This can easily be done using Zend Framework’s request object. Within a controller in Zend you can call:

$this->getRequest()->getControllerName() – to return the controller name
$this->getRequest()->getActionName() – to return the action name

and

$this->getRequest()->getParams() – to return an array with the parameters used in the URL

So we can get the controller and action directly by calling getControllerName() and getActionName(). We will need a little function to retrieve all the parameters and values returned by getRequest()->getParams() and place them in a string.

public function canonicalUrl()
{
    $request = Zend_Controller_Front::getInstance()->getRequest();
    // Allow only letters, digits and white space in values
    // (note: this also strips hyphens, e.g. in "Aspects-of-Nature").
    $filter = new Zend_Filter_Alnum(true);
    $params = array();
    foreach ($request->getParams() as $key => $value) {
        // Controller, action and module are not part of the parameter string.
        if (in_array($key, array('controller', 'action', 'module'))) {
            continue;
        }
        array_push($params, $key . '/' . $filter->filter($value));
    }
    // Result looks like "did/25/title/..."
    return implode('/', $params);
}

Once we have all the parameters ordered in a /varname/value/varname2/value… fashion, all we need to do is combine them into a canonical URL, and the best way is to create a view variable (canonicUrl here) in the preDispatch() method of my controller.
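A minimal sketch of that step, written as a framework-free function so it can be tried on its own. The name buildCanonicalUrl() and the base URL are assumptions for illustration, and the commented preDispatch() wiring is only one way it might be hooked into a ZF1 controller. Unlike the Alnum filter above, this variant percent-encodes values, so hyphens in titles survive.

```php
<?php
// Hypothetical helper: assemble base + /controller/action/key/value/... into one URL.
function buildCanonicalUrl(string $base, string $controller, string $action, array $params): string
{
    $parts = array($controller, $action);

    foreach ($params as $key => $value) {
        // Skip the routing keys and anything that is not a plain value.
        if (in_array($key, array('controller', 'action', 'module'), true) || !is_scalar($value)) {
            continue;
        }
        $parts[] = rawurlencode((string) $key);
        $parts[] = rawurlencode((string) $value);
    }

    return rtrim($base, '/') . '/' . implode('/', $parts);
}

// Possible wiring inside a ZF1 controller (assumed, not taken from the original post):
// public function preDispatch()
// {
//     $r = $this->getRequest();
//     $this->view->canonicUrl = buildCanonicalUrl(
//         'http://www.avhumboldt.net/humboldt',
//         $r->getControllerName(),
//         $r->getActionName(),
//         $r->getParams()
//     );
// }

echo buildCanonicalUrl(
    'http://www.avhumboldt.net/humboldt',
    'publications',
    'books',
    array('controller' => 'publications', 'action' => 'books', 'did' => 25, 'title' => 'Aspects-of-Nature')
), "\n";
```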

The view variable is created for all the actions of the controller and can be accessed by any view script (those .phtml files under the views/scripts/controllername folder). Inserting it in the page is as easy as calling:

$this->headLink()->headLink(array('rel' => 'canonical', 'href' => $this->canonicUrl), 'PREPEND');

The code above is used in the view scripts of Zend Framework and creates a link tag which can be output from the main layout. In the main layout (it should be main.phtml by default) just add:

echo "\n" . $this->headLink() . "\n";

somewhere in the <head> section. You should now have nice canonical URLs on every page generated by your controller. A better way would be to create a plugin that provides canonical URLs for every controller, but this is all I have needed so far.
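If you still prefer the quick REQUEST_URI route mentioned earlier, a somewhat more defensive variant might look like the sketch below. The function name canonicalFromRequestUri() is hypothetical; the idea is simply to keep only the path, drop the query string, and percent-encode every segment before it goes anywhere near the page.

```php
<?php
// Hypothetical helper: build a canonical URL from a fixed base and the raw request URI.
function canonicalFromRequestUri(string $base, string $requestUri): string
{
    // Keep only the path; the query string and fragment are discarded.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        $path = '/';
    }

    // Decode then re-encode each segment so quotes and angle brackets never survive raw.
    $segments = array_map('rawurlencode', array_map('rawurldecode', explode('/', $path)));

    return rtrim($base, '/') . '/' . ltrim(implode('/', $segments), '/');
}

echo canonicalFromRequestUri(
    'http://www.avhumboldt.net/',
    '/humboldt/publications/books/did/25/title/Aspects-of-Nature?color=red'
), "\n";
```

Even with this in place, it is still worth passing the result through htmlspecialchars() when writing it into the href attribute by hand.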

List of English Stop Words

Stop Words

Stop words are words which are not significant enough to be used in search queries. Usually these words are filtered out of search queries because they return a vast amount of unnecessary information. A better definition is provided below:

“Words that do not appear in the index in a particular database because they are either insignificant (i.e., articles, prepositions) or so common that the results would be higher than the system can handle (as in the case of IUCAT where terms such as United States or Department are stop words in keyword searching.) Stop words vary from system to system. Also, some systems will merely ignore stop words where use of stop words in other systems will result in retrieving zero hits. ”

http://www.iusb.edu/~libg/instruction/helpguide/handouts/2005Boolean.shtml

Since I needed to use them in a project (Humboldt Digital Library and Network), I am posting here a list of English stop words and, below it, a PHP array containing these words.

Here is the list of English stop words:

a
about
above
across
after
afterwards
again
against
all
almost
alone
along
already
also
although
always
am
among
amongst
amoungst
amount
an
and
another
any
anyhow
anyone
anything
anyway
anywhere
are
around
as
at
back
be
became
because
become
becomes
becoming
been
before
beforehand
behind
being
below
beside
besides
between
beyond
bill
both
bottom
but
by
call
can
cannot
cant
co
computer
con
could
couldnt
cry
de
describe
detail
do
done
down
due
during
each
eg
eight
either
eleven
else
elsewhere
empty
enough
etc
even
ever
every
everyone
everything
everywhere
except
few
fifteen
fifty
fill
find
fire
first
five
for
former
formerly
forty
found
four
from
front
full
further
get
give
go
had
has
hasnt
have
he
hence
her
here
hereafter
hereby
herein
hereupon
hers
herself
him
himself
his
how
however
hundred
i
ie
if
in
inc
indeed
interest
into
is
it
its
itself
keep
last
latter
latterly
least
less
ltd
made
many
may
me
meanwhile
might
mill
mine
more
moreover
most
mostly
move
much
must
my
myself
name
namely
neither
never
nevertheless
next
nine
no
nobody
none
noone
nor
not
nothing
now
nowhere
of
off
often
on
once
one
only
onto
or
other
others
otherwise
our
ours
ourselves
out
over
own
part
per
perhaps
please
put
rather
re
same
see
seem
seemed
seeming
seems
serious
several
she
should
show
side
since
sincere
six
sixty
so
some
somehow
someone
something
sometime
sometimes
somewhere
still
such
system
take
ten
than
that
the
their
them
themselves
then
thence
there
thereafter
thereby
therefore
therein
thereupon
these
they
thick
thin
third
this
those
though
three
through
throughout
thru
thus
to
together
too
top
toward
towards
twelve
twenty
two
un
under
until
up
upon
us
very
via
was
we
well
were
what
whatever
when
whence
whenever
where
whereafter
whereas
whereby
wherein
whereupon
wherever
whether
which
while
whither
who
whoever
whole
whom
whose
why
will
with
within
without
would
yet
you
your
yours
yourself
yourselves

And here is a PHP array with the stop words:
$stopwords = array("a", "about", "above", "across", "after", "afterwards", "again", "against", "all", "almost", "alone", "along", "already", "also", "although", "always", "am", "among", "amongst", "amoungst", "amount", "an", "and", "another", "any", "anyhow", "anyone", "anything", "anyway", "anywhere", "are", "around", "as", "at", "back", "be", "became", "because", "become", "becomes", "becoming", "been", "before", "beforehand", "behind", "being", "below", "beside", "besides", "between", "beyond", "bill", "both", "bottom", "but", "by", "call", "can", "cannot", "cant", "co", "con", "could", "couldnt", "cry", "de", "describe", "detail", "do", "done", "down", "due", "during", "each", "eg", "eight", "either", "eleven", "else", "elsewhere", "empty", "enough", "etc", "even", "ever", "every", "everyone", "everything", "everywhere", "except", "few", "fifteen", "fifty", "fill", "find", "fire", "first", "five", "for", "former", "formerly", "forty", "found", "four", "from", "front", "full", "further", "get", "give", "go", "had", "has", "hasnt", "have", "he", "hence", "her", "here", "hereafter", "hereby", "herein", "hereupon", "hers", "herself", "him", "himself", "his", "how", "however", "hundred", "ie", "if", "in", "inc", "indeed", "interest", "into", "is", "it", "its", "itself", "keep", "last", "latter", "latterly", "least", "less", "ltd", "made", "many", "may", "me", "meanwhile", "might", "mill", "mine", "more", "moreover", "most", "mostly", "move", "much", "must", "my", "myself", "name", "namely", "neither", "never", "nevertheless", "next", "nine", "no", "nobody", "none", "noone", "nor", "not", "nothing", "now", "nowhere", "of", "off", "often", "on", "once", "one", "only", "onto", "or", "other", "others", "otherwise", "our", "ours", "ourselves", "out", "over", "own", "part", "per", "perhaps", "please", "put", "rather", "re", "same", "see", "seem", "seemed", "seeming", "seems", "serious", "several", "she", "should", "show", "side", "since", "sincere", "six", "sixty", "so",
"some", "somehow", "someone", "something", "sometime", "sometimes", "somewhere", "still", "such", "system", "take", "ten", "than", "that", "the", "their", "them", "themselves", "then", "thence", "there", "thereafter", "thereby", "therefore", "therein", "thereupon", "these", "they", "thick", "thin", "third", "this", "those", "though", "three", "through", "throughout", "thru", "thus", "to", "together", "too", "top", "toward", "towards", "twelve", "twenty", "two", "un", "under", "until", "up", "upon", "us", "very", "via", "was", "we", "well", "were", "what", "whatever", "when", "whence", "whenever", "where", "whereafter", "whereas", "whereby", "wherein", "whereupon", "wherever", "whether", "which", "while", "whither", "who", "whoever", "whole", "whom", "whose", "why", "will", "with", "within", "without", "would", "yet", "you", "your", "yours", "yourself", "yourselves");
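As a rough illustration of how such an array might be used, the sketch below strips stop words from a search query. The function name removeStopWords() and the tiny sample list are assumptions for the example; in practice you would pass the full $stopwords array.

```php
<?php
// Hypothetical helper: return the query's words with stop words removed.
function removeStopWords(string $query, array $stopwords): array
{
    // Flip the list so each membership check is a hash lookup, not a linear scan.
    $lookup = array_flip($stopwords);

    // Lowercase and split on anything that is not a letter or digit.
    $tokens = preg_split('/[^a-z0-9]+/', strtolower($query), -1, PREG_SPLIT_NO_EMPTY);

    // Keep only tokens that are not in the stop word list, reindexed from 0.
    return array_values(array_filter($tokens, function ($token) use ($lookup) {
        return !isset($lookup[$token]);
    }));
}

$sample = array('a', 'of', 'the', 'in');
print_r(removeStopWords('Aspects of Nature in the Tropics', $sample));
```

With the sample list, the remaining words are aspects, nature and tropics. Note that this simple tokenizer splits on apostrophes, so lists containing entries like "can't" would need a different split pattern.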

Updated October 3rd, 2009.

This is the stop word list used by the MySQL FullText feature:

a’s, able, about, above, according, accordingly, across, actually, after, afterwards, again, against, ain’t, all, allow, allows, almost, alone, along, already, also, although, always, am, among, amongst, an, and, another, any, anybody, anyhow, anyone, anything, anyway, anyways, anywhere, apart, appear, appreciate, appropriate, are, aren’t, around, as, aside, ask, asking, associated, at, available, away, awfully, be, became, because, become, becomes, becoming, been, before, beforehand, behind, being, believe, below, beside, besides, best, better, between, beyond, both, brief, but, by, c’mon, c’s, came, can, can’t, cannot, cant, cause, causes, certain, certainly, changes, clearly, co, com, come, comes, concerning, consequently, consider, considering, contain, containing, contains, corresponding, could, couldn’t, course, currently, definitely, described, despite, did, didn’t, different, do, does, doesn’t, doing, don’t, done, down, downwards, during, each, edu, eg, eight, either, else, elsewhere, enough, entirely, especially, et, etc, even, ever, every, everybody, everyone, everything, everywhere, ex, exactly, example, except, far, few, fifth, first, five, followed, following, follows, for, former, formerly, forth, four, from, further, furthermore, get, gets, getting, given, gives, go, goes, going, gone, got, gotten, greetings, had, hadn’t, happens, hardly, has, hasn’t, have, haven’t, having, he, he’s, hello, help, hence, her, here, here’s, hereafter, hereby, herein, hereupon, hers, herself, hi, him, himself, his, hither, hopefully, how, howbeit, however, i’d, i’ll, i’m, i’ve, ie, if, ignored, immediate, in, inasmuch, inc, indeed, indicate, indicated, indicates, inner, insofar, instead, into, inward, is, isn’t, it, it’d, it’ll, it’s, its, itself, just, keep, keeps, kept, know, knows, known, last, lately, later, latter, latterly, least, less, lest, let, let’s, like, liked, likely, little, look, looking, looks, ltd, mainly, many, may, maybe, me, mean, meanwhile, merely, 
might, more, moreover, most, mostly, much, must, my, myself, name, namely, nd, near, nearly, necessary, need, needs, neither, never, nevertheless, new, next, nine, no, nobody, non, none, noone, nor, normally, not, nothing, novel, now, nowhere, obviously, of, off, often, oh, ok, okay, old, on, once, one, ones, only, onto, or, other, others, otherwise, ought, our, ours, ourselves, out, outside, over, overall, own, particular, particularly, per, perhaps, placed, please, plus, possible, presumably, probably, provides, que, quite, qv, rather, rd, re, really, reasonably, regarding, regardless, regards, relatively, respectively, right, said, same, saw, say, saying, says, second, secondly, see, seeing, seem, seemed, seeming, seems, seen, self, selves, sensible, sent, serious, seriously, seven, several, shall, she, should, shouldn’t, since, six, so, some, somebody, somehow, someone, something, sometime, sometimes, somewhat, somewhere, soon, sorry, specified, specify, specifying, still, sub, such, sup, sure, t’s, take, taken, tell, tends, th, than, thank, thanks, thanx, that, that’s, thats, the, their, theirs, them, themselves, then, thence, there, there’s, thereafter, thereby, therefore, therein, theres, thereupon, these, they, they’d, they’ll, they’re, they’ve, think, third, this, thorough, thoroughly, those, though, three, through, throughout, thru, thus, to, together, too, took, toward, towards, tried, tries, truly, try, trying, twice, two, un, under, unfortunately, unless, unlikely, until, unto, up, upon, us, use, used, useful, uses, using, usually, value, various, very, via, viz, vs, want, wants, was, wasn’t, way, we, we’d, we’ll, we’re, we’ve, welcome, well, went, were, weren’t, what, what’s, whatever, when, whence, whenever, where, where’s, whereafter, whereas, whereby, wherein, whereupon, wherever, whether, which, while, whither, who, who’s, whoever, whole, whom, whose, why, will, willing, wish, with, within, without, won’t, wonder, would, would, wouldn’t, yes, yet, 
you, you’d, you’ll, you’re, you’ve, your, yours, yourself, yourselves, zero

CSV Format

a,able,about,across,after,all,almost,also,am,among,an,and,any,are,as,at,be,because,been,but,by,can,cannot,could,dear,did,do,does,either,else,ever,every,for,from,get,got,had,has,have,he,her,hers,him,his,how,however,i,if,in,into,is,it,its,just,least,let,like,likely,may,me,might,most,must,my,neither,no,nor,not,of,off,often,on,only,or,other,our,own,rather,said,say,says,she,should,since,so,some,than,that,the,their,them,then,there,these,they,this,tis,to,too,twas,us,wants,was,we,were,what,when,where,which,while,who,whom,why,will,with,would,yet,you,your

I have also created another article where you can download stop words in csv, txt or as a php file